Merge branch 'develop' into develop

main
Ben Busby 2020-10-07 18:38:36 -04:00 committed by GitHub
commit 2126742b76
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
22 changed files with 276 additions and 139 deletions

@ -10,8 +10,5 @@ assignees: ''
**Describe the feature you'd like to see added** **Describe the feature you'd like to see added**
A short description of the feature, and what it would accomplish. A short description of the feature, and what it would accomplish.
**Describe which parts of the project this would modify (front end/back end/configuration/etc)**
A short description of which aspects of Whoogle Search would need modification
**Additional context** **Additional context**
Add any other context or screenshots about the feature request here. Add any other context or screenshots about the feature request here.

2
.replit Normal file

@ -0,0 +1,2 @@
language = "python3"
run = "pip install -r requirements.txt && ./run"

@ -5,6 +5,7 @@
[![Build Status](https://travis-ci.com/benbusby/whoogle-search.svg?branch=master)](https://travis-ci.com/benbusby/whoogle-search) [![Build Status](https://travis-ci.com/benbusby/whoogle-search.svg?branch=master)](https://travis-ci.com/benbusby/whoogle-search)
[![codebeat badge](https://codebeat.co/badges/e96cada2-fb6f-4528-8285-7d72abd74e8d)](https://codebeat.co/projects/github-com-benbusby-shoogle-master) [![codebeat badge](https://codebeat.co/badges/e96cada2-fb6f-4528-8285-7d72abd74e8d)](https://codebeat.co/projects/github-com-benbusby-shoogle-master)
[![Docker Pulls](https://img.shields.io/docker/pulls/benbusby/whoogle-search)](https://hub.docker.com/r/benbusby/whoogle-search) [![Docker Pulls](https://img.shields.io/docker/pulls/benbusby/whoogle-search)](https://hub.docker.com/r/benbusby/whoogle-search)
[![Gitter](https://img.shields.io/gitter/room/benbusby/whoogle-search)](https://gitter.im/whoogle-search/community)
Get Google search results, but without any ads, javascript, AMP links, cookies, or IP address tracking. Easily deployable in one click as a Docker app, and customizable with a single config file. Quick and simple to implement as a primary search engine replacement on both desktop and mobile. Get Google search results, but without any ads, javascript, AMP links, cookies, or IP address tracking. Easily deployable in one click as a Docker app, and customizable with a single config file. Quick and simple to implement as a primary search engine replacement on both desktop and mobile.
@ -21,7 +22,7 @@ Contents
- No ads or sponsored content - No ads or sponsored content
- No javascript - No javascript
- No cookies - No cookies
- No tracking/linking of your personal IP address - No tracking/linking of your personal IP address\*
- No AMP links - No AMP links
- No URL tracking tags (i.e. utm=%s) - No URL tracking tags (i.e. utm=%s)
- No referrer header - No referrer header
@ -34,6 +35,8 @@ Contents
- Optional location-based searching (i.e. results near \<city\>) - Optional location-based searching (i.e. results near \<city\>)
- Optional NoJS mode to disable all Javascript in results - Optional NoJS mode to disable all Javascript in results
<sup>*If deployed to a remote server</sup>
## Dependencies ## Dependencies
If using Heroku Quick Deploy, **you can skip this section**. If using Heroku Quick Deploy, **you can skip this section**.
@ -55,19 +58,28 @@ There are a few different ways to begin using the app, depending on your prefere
Provides: Provides:
- Free deployment of app - Free deployment of app
- Free https url (https://\<your app name\>.herokuapp.com) - Free HTTPS url (https://\<your app name\>.herokuapp.com)
- Downtime after periods of inactivity \([solution](https://github.com/benbusby/whoogle-search#prevent-downtime-heroku-only)\) - Downtime after periods of inactivity \([solution](https://github.com/benbusby/whoogle-search#prevent-downtime-heroku-only)\)
### B) [pipx](https://github.com/pipxproject/pipx#install-pipx) ### B) [Repl.it](https://repl.it)
[![Run on Repl.it](https://repl.it/badge/github/benbusby/whoogle-search)](https://repl.it/github/benbusby/whoogle-search)
Provides:
- Free deployment of app (can be run without an account)
- Free HTTPS url (https://\<app name\>.\<username\>\.repl\.co)
- Supports custom domains
- Downtime after periods of inactivity \([solution 1](https://repl.it/talk/ask/use-this-pingmat1replco-just-enter/28821/101298), [solution 2](https://repl.it/talk/learn/How-to-use-and-setup-UptimeRobot/9003)\)
### C) [pipx](https://github.com/pipxproject/pipx#install-pipx)
Persistent install: Persistent install:
`pipx install git+https://github.com/benbusby/whoogle-search.git` `pipx install git+https://github.com/benbusby/whoogle-search.git`
Sandboxed temporary instance: Sandboxed temporary instance:
`pipx run git+https://github.com/benbusby/whoogle-search.git whoogle-search` `pipx run --spec git+https://github.com/benbusby/whoogle-search.git whoogle-search`
### C) pip ### D) pip
`pip install whoogle-search` `pip install whoogle-search`
```bash ```bash
@ -85,7 +97,7 @@ optional arguments:
--https-only Enforces HTTPS redirects for all requests (default False) --https-only Enforces HTTPS redirects for all requests (default False)
``` ```
### D) Manual ### E) Manual
Clone the repo and run the following commands to start the app in a local-only environment: Clone the repo and run the following commands to start the app in a local-only environment:
```bash ```bash
@ -124,7 +136,7 @@ sudo systemctl enable whoogle
sudo systemctl start whoogle sudo systemctl start whoogle
``` ```
### E) Manual (Docker) ### F) Manual (Docker)
1. Ensure the Docker daemon is running, and is accessible by your user account 1. Ensure the Docker daemon is running, and is accessible by your user account
- To add user permissions, you can execute `sudo usermod -aG docker yourusername` - To add user permissions, you can execute `sudo usermod -aG docker yourusername`
- Running `docker ps` should return something besides an error. If you encounter an error saying the daemon isn't running, try `sudo systemctl start docker` (Linux) or ensure the docker tool is running (Windows/macOS). - Running `docker ps` should return something besides an error. If you encounter an error saying the daemon isn't running, try `sudo systemctl start docker` (Linux) or ensure the docker tool is running (Windows/macOS).
@ -194,15 +206,23 @@ Update browser settings:
- Firefox (iOS) - Firefox (iOS)
- In the mobile app Settings page, tap "Search" within the "General" section. There should be an option titled "Add Search Engine" to select. It should prompt you to enter a title and search query url - use the following elements to fill out the form: - In the mobile app Settings page, tap "Search" within the "General" section. There should be an option titled "Add Search Engine" to select. It should prompt you to enter a title and search query url - use the following elements to fill out the form:
- Title: "Whoogle" - Title: "Whoogle"
- URL: "https://\<your whoogle url\>/search?q=%s" - URL: `http[s]://\<your whoogle url\>/search?q=%s`
- Firefox (Android) - Firefox (Android)
- Version <79.0.0
- Navigate to your app's url - Navigate to your app's url
- Long-press on the search text field - Long-press on the search text field
- Click the "Add Search Engine" menu item - Click the "Add Search Engine" menu item
- Select a name and click ok - Select a name and click ok
- Click the 3 dot menu in the top right - Click the 3 dot menu in the top right
- Navigate to the settings menu and select the "search" sub-menu - Navigate to the settings menu and select the "Search" sub-menu
- Select Whoogle and press "Set as default" - Select Whoogle and press "Set as default"
- Version >=79.0.0
- Click the 3 dot menu in the top right
- Navigate to the settings menu and select the "Search" sub-menu
- Click "Add search engine"
- Select the 'Other' radio button
- Name: "Whoogle"
- Search string to use: `https://\<your whoogle url\>/search?q=%s`
- [Alfred](https://www.alfredapp.com/) (Mac OS X) - [Alfred](https://www.alfredapp.com/) (Mac OS X)
1. Go to `Alfred Preferences` > `Features` > `Web Search` and click `Add Custom Search`. Then configure these settings 1. Go to `Alfred Preferences` > `Features` > `Web Search` and click `Add Custom Search`. Then configure these settings
- Search URL: `https://\<your whoogle url\>/search?q={query} - Search URL: `https://\<your whoogle url\>/search?q={query}

@ -1,4 +1,4 @@
from app.utils.misc import generate_user_keys from app.utils.session_utils import generate_user_keys
from flask import Flask from flask import Flask
from flask_session import Session from flask_session import Session
import os import os
@ -9,7 +9,7 @@ app.default_key_set = generate_user_keys()
app.no_cookie_ips = [] app.no_cookie_ips = []
app.config['SECRET_KEY'] = os.urandom(32) app.config['SECRET_KEY'] = os.urandom(32)
app.config['SESSION_TYPE'] = 'filesystem' app.config['SESSION_TYPE'] = 'filesystem'
app.config['VERSION_NUMBER'] = '0.2.0' app.config['VERSION_NUMBER'] = '0.2.1'
app.config['APP_ROOT'] = os.getenv('APP_ROOT', os.path.dirname(os.path.abspath(__file__))) app.config['APP_ROOT'] = os.getenv('APP_ROOT', os.path.dirname(os.path.abspath(__file__)))
app.config['STATIC_FOLDER'] = os.getenv('STATIC_FOLDER', os.path.join(app.config['APP_ROOT'], 'static')) app.config['STATIC_FOLDER'] = os.getenv('STATIC_FOLDER', os.path.join(app.config['APP_ROOT'], 'static'))
app.config['CONFIG_PATH'] = os.getenv('CONFIG_VOLUME', os.path.join(app.config['STATIC_FOLDER'], 'config')) app.config['CONFIG_PATH'] = os.getenv('CONFIG_VOLUME', os.path.join(app.config['STATIC_FOLDER'], 'config'))

@ -1,56 +1,11 @@
from app.request import VALID_PARAMS from app.request import VALID_PARAMS
from app.utils.misc import BLACKLIST from app.utils.filter_utils import *
from bs4 import BeautifulSoup
from bs4.element import ResultSet from bs4.element import ResultSet
from cryptography.fernet import Fernet from cryptography.fernet import Fernet
import re import re
import urllib.parse as urlparse import urllib.parse as urlparse
from urllib.parse import parse_qs from urllib.parse import parse_qs
SKIP_ARGS = ['ref_src', 'utm']
FULL_RES_IMG = '<br/><a href="{}">Full Image</a>'
GOOG_IMG = '/images/branding/searchlogo/1x/googlelogo'
LOGO_URL = GOOG_IMG + '_desk'
BLANK_B64 = '''

'''
def get_first_link(soup):
# Replace hrefs with only the intended destination (no "utm" type tags)
for a in soup.find_all('a', href=True):
# Return the first search result URL
if 'url?q=' in a['href']:
return filter_link_args(a['href'])
def filter_link_args(query_link):
parsed_link = urlparse.urlparse(query_link)
link_args = parse_qs(parsed_link.query)
safe_args = {}
if len(link_args) == 0 and len(parsed_link) > 0:
return query_link
for arg in link_args.keys():
if arg in SKIP_ARGS:
continue
safe_args[arg] = link_args[arg]
# Remove original link query and replace with filtered args
query_link = query_link.replace(parsed_link.query, '')
if len(safe_args) > 0:
query_link = query_link + urlparse.urlencode(safe_args, doseq=True)
else:
query_link = query_link.replace('?', '')
return query_link
def has_ad_content(element: str):
return element.upper() in (value.upper() for value in BLACKLIST) or 'ⓘ' in element
class Filter: class Filter:
def __init__(self, user_keys: dict, mobile=False, config=None): def __init__(self, user_keys: dict, mobile=False, config=None):
@ -61,6 +16,7 @@ class Filter:
self.dark = config['dark'] if 'dark' in config else False self.dark = config['dark'] if 'dark' in config else False
self.nojs = config['nojs'] if 'nojs' in config else False self.nojs = config['nojs'] if 'nojs' in config else False
self.new_tab = config['new_tab'] if 'new_tab' in config else False self.new_tab = config['new_tab'] if 'new_tab' in config else False
self.alt_redirect = config['alts'] if 'alts' in config else False
self.mobile = mobile self.mobile = mobile
self.user_keys = user_keys self.user_keys = user_keys
self.main_divs = ResultSet('') self.main_divs = ResultSet('')
@ -188,18 +144,6 @@ class Filter:
except AttributeError: except AttributeError:
pass pass
# Set up dark mode if active
if self.dark:
soup.find('html')['style'] = 'scrollbar-color: #333 #111;color:#fff !important;background:#000 !important'
for input_element in soup.findAll('input'):
input_element['style'] = 'color:#fff;background:#000;'
for span_element in soup.findAll('span'):
span_element['style'] = 'color: white;'
for href_element in soup.findAll('a'):
href_element['style'] = 'color: white' if href_element['href'].startswith('/search') else ''
def update_link(self, link): def update_link(self, link):
# Replace href with only the intended destination (no "utm" type tags) # Replace href with only the intended destination (no "utm" type tags)
href = link['href'].replace('https://www.google.com', '') href = link['href'].replace('https://www.google.com', '')
@ -213,8 +157,12 @@ class Filter:
query_link = parse_qs(result_link.query)['q'][0] if '?q=' in href else '' query_link = parse_qs(result_link.query)['q'][0] if '?q=' in href else ''
if query_link.startswith('/'): if query_link.startswith('/'):
# Internal google links (i.e. mail, maps, etc) should still be forwarded to Google
link['href'] = 'https://google.com' + query_link link['href'] = 'https://google.com' + query_link
elif '/search?q=' in href: elif '/search?q=' in href:
# "li:1" implies the query should be interpreted verbatim, so we wrap it in double quotes
if 'li:1' in href:
query_link = '"' + query_link + '"'
new_search = '/search?q=' + self.encrypt_path(query_link) new_search = '/search?q=' + self.encrypt_path(query_link)
query_params = parse_qs(urlparse.urlparse(href).query) query_params = parse_qs(urlparse.urlparse(href).query)
@ -232,11 +180,13 @@ class Filter:
else: else:
link['href'] = href link['href'] = href
# Replace link location if "alts" config is enabled
if self.alt_redirect:
# Search and replace all link descriptions with alternative location
link['href'] = get_site_alt(link['href'])
link_desc = link.find_all(text=re.compile('|'.join(SITE_ALTS.keys())))
if len(link_desc) == 0:
return
def gen_nojs(sibling): # Replace link destination
nojs_link = BeautifulSoup().new_tag('a') link_desc[0].replace_with(get_site_alt(link_desc[0]))
nojs_link['href'] = '/window?location=' + sibling['href']
nojs_link['style'] = 'display:block;width:100%;'
nojs_link.string = 'NoJS Link: ' + nojs_link['href']
sibling.append(BeautifulSoup('<br><hr><br>', 'html.parser'))
sibling.append(nojs_link)
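
Note on the `li:1` branch added to `update_link` in the hunk above: when the href contains `li:1`, the query is wrapped in double quotes before it is re-encoded. A minimal sketch of that single step follows. It is not the project's code: `quote_verbatim` is a hypothetical helper name, the sample hrefs are invented, and the `encrypt_path` call that the real method applies afterwards is left out.

```python
from urllib.parse import parse_qs, urlparse


def quote_verbatim(href: str) -> str:
    """Hypothetical helper: extract the 'q' value and quote it if 'li:1' is present."""
    params = parse_qs(urlparse(href).query)
    query = params['q'][0] if 'q' in params else ''

    # Same check as the hunk: "li:1" anywhere in the href means verbatim search
    if 'li:1' in href:
        query = '"' + query + '"'
    return query


if __name__ == '__main__':
    print(quote_verbatim('/search?q=whoogle&tbs=li:1'))  # quoted
    print(quote_verbatim('/search?q=whoogle'))           # left as-is
```

The substring check is deliberately as loose as the one in the diff; a stricter version could inspect the parsed `tbs` parameter instead.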

@ -2,7 +2,7 @@ class Config:
# Derived from here: # Derived from here:
# https://sites.google.com/site/tomihasa/google-language-codes#searchlanguage # https://sites.google.com/site/tomihasa/google-language-codes#searchlanguage
LANGUAGES = [ LANGUAGES = [
{'name': 'Default (use server location)', 'value': ''}, {'name': 'Default (none specified)', 'value': ''},
{'name': 'English', 'value': 'lang_en'}, {'name': 'English', 'value': 'lang_en'},
{'name': 'Afrikaans', 'value': 'lang_af'}, {'name': 'Afrikaans', 'value': 'lang_af'},
{'name': 'Arabic', 'value': 'lang_ar'}, {'name': 'Arabic', 'value': 'lang_ar'},
@ -52,7 +52,7 @@ class Config:
] ]
COUNTRIES = [ COUNTRIES = [
{'name': 'Default (use server location)', 'value': ''}, {'name': 'Default (none)', 'value': ''},
{'name': 'Afghanistan', 'value': 'countryAF'}, {'name': 'Afghanistan', 'value': 'countryAF'},
{'name': 'Albania', 'value': 'countryAL'}, {'name': 'Albania', 'value': 'countryAL'},
{'name': 'Algeria', 'value': 'countryDZ'}, {'name': 'Algeria', 'value': 'countryDZ'},
@ -306,6 +306,7 @@ class Config:
self.dark = False self.dark = False
self.nojs = False self.nojs = False
self.near = '' self.near = ''
self.alts = False
self.new_tab = False self.new_tab = False
self.get_only = False self.get_only = False

@ -12,7 +12,7 @@ MOBILE_UA = '{}/5.0 (Android 0; Mobile; rv:54.0) Gecko/54.0 {}/59.0'
DESKTOP_UA = '{}/5.0 (X11; {} x86_64; rv:75.0) Gecko/20100101 {}/75.0' DESKTOP_UA = '{}/5.0 (X11; {} x86_64; rv:75.0) Gecko/20100101 {}/75.0'
# Valid query params # Valid query params
VALID_PARAMS = ['tbs', 'tbm', 'start', 'near', 'source'] VALID_PARAMS = ['tbs', 'tbm', 'start', 'near', 'source', 'nfpr']
def gen_user_agent(is_mobile): def gen_user_agent(is_mobile):
@ -68,6 +68,10 @@ def gen_query(query, args, config, near_city=None):
else: else:
param_dict['lr'] = ('&lr=' + config.lang_search) if config.lang_search else '' param_dict['lr'] = ('&lr=' + config.lang_search) if config.lang_search else ''
# Set autocorrected search ignore
if 'nfpr' in args:
param_dict['nfpr'] = '&nfpr=' + args.get('nfpr')
param_dict['cr'] = ('&cr=' + config.ctry) if config.ctry else '' param_dict['cr'] = ('&cr=' + config.ctry) if config.ctry else ''
param_dict['hl'] = ('&hl=' + config.lang_interface.replace('lang_', '')) if config.lang_interface else '' param_dict['hl'] = ('&hl=' + config.lang_interface.replace('lang_', '')) if config.lang_interface else ''
param_dict['safe'] = '&safe=' + ('active' if config.safe else 'off') param_dict['safe'] = '&safe=' + ('active' if config.safe else 'off')
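
Note on the `nfpr` handling added to `gen_query` above: the parameter is forwarded only when the incoming request carried it, and is otherwise omitted from the query string. The sketch below reduces that to two parameters so the effect is visible. `build_params` is a hypothetical name, and passing a plain `dict` for `args` is an assumption made for illustration; the real function handles many more parameters.

```python
def build_params(args: dict, safe: bool = False) -> str:
    """Hypothetical, reduced version of the parameter assembly in gen_query."""
    param_dict = {}

    # Forward 'nfpr' only if the request included it (mirrors the added hunk)
    if 'nfpr' in args:
        param_dict['nfpr'] = '&nfpr=' + args.get('nfpr')

    param_dict['safe'] = '&safe=' + ('active' if safe else 'off')

    # Dicts keep insertion order, so parameters come out in the order set above
    return ''.join(param_dict.values())


if __name__ == '__main__':
    print(build_params({'nfpr': '1'}))
    print(build_params({}))
```

Because `'nfpr'` was also added to `VALID_PARAMS` in the same commit, the value can survive the round trip from a result link back into the next request.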

@ -15,7 +15,7 @@ from requests import exceptions
from app import app from app import app
from app.models.config import Config from app.models.config import Config
from app.request import Request from app.request import Request
from app.utils.misc import valid_user_session from app.utils.session_utils import valid_user_session
from app.utils.routing_utils import * from app.utils.routing_utils import *
@ -115,12 +115,11 @@ def opensearch():
if opensearch_url.endswith('/'): if opensearch_url.endswith('/'):
opensearch_url = opensearch_url[:-1] opensearch_url = opensearch_url[:-1]
template = render_template('opensearch.xml', return render_template(
'opensearch.xml',
main_url=opensearch_url, main_url=opensearch_url,
request_type='get' if g.user_config.get_only else 'post') request_type='' if g.user_config.get_only else 'method="post"'
response = make_response(template) ), 200, {'Content-Disposition': 'attachment; filename="opensearch.xml"'}
response.headers['Content-Type'] = 'application/xml'
return response
@app.route('/autocomplete', methods=['GET', 'POST']) @app.route('/autocomplete', methods=['GET', 'POST'])

@ -0,0 +1,42 @@
html {
background-color: #000 !important;
}
body {
background-color: #222 !important;
}
div {
/*background-color: #111 !important;*/
color: #fff !important;
}
a:visited h3 div {
color: #bbbbff !important;
}
a:link h3 div {
color: #4b8eea !important;
}
a:link div {
color: #aaffaa !important;
}
div span {
color: #bbb !important;
}
input {
background-color: #111 !important;
color: #fff !important;
}
#search-bar {
color: #fff !important;
background-color: #000 !important;
}
.search-container {
background-color: #000 !important;
}

@ -16,6 +16,7 @@ body {
left: 50%; left: 50%;
transform: translate(-50%, -50%); transform: translate(-50%, -50%);
max-width: 600px; max-width: 600px;
z-index: 15;
} }
.search-items { .search-items {
@ -34,10 +35,10 @@ body {
color: #685e79; color: #685e79;
border-radius: 10px 10px 0 0; border-radius: 10px 10px 0 0;
max-width: 600px; max-width: 600px;
background: rgba(0,0,0,0); background: rgba(0, 0, 0, 0);
} }
#search-bar:focus{ #search-bar:focus {
color: #685e79; color: #685e79;
} }
@ -45,7 +46,7 @@ body {
width: 100%; width: 100%;
height: 40px; height: 40px;
border: 1px solid #685e79; border: 1px solid #685e79;
background: #685e79; background: #685e79 !important;
text-align: center; text-align: center;
color: #fff; color: #fff;
cursor: pointer; cursor: pointer;
@ -68,7 +69,7 @@ button::-moz-focus-inner {
.collapsible { .collapsible {
outline: 0; outline: 0;
background-color: rgba(0,0,0,0); background-color: rgba(0, 0, 0, 0);
color: #685e79; color: #685e79;
cursor: pointer; cursor: pointer;
padding: 18px; padding: 18px;
@ -127,5 +128,10 @@ footer {
bottom: 0%; bottom: 0%;
text-align: center; text-align: center;
width: 100%; width: 100%;
z-index: -1; z-index: 10;
}
.info-text {
font-style: italic;
font-size: 12px;
} }

@ -2,7 +2,7 @@ const handleUserInput = searchBar => {
let xhrRequest = new XMLHttpRequest(); let xhrRequest = new XMLHttpRequest();
xhrRequest.open("POST", "/autocomplete"); xhrRequest.open("POST", "/autocomplete");
xhrRequest.setRequestHeader("Content-type", "application/x-www-form-urlencoded"); xhrRequest.setRequestHeader("Content-type", "application/x-www-form-urlencoded");
xhrRequest.onload = function() { xhrRequest.onload = function () {
if (xhrRequest.readyState === 4 && xhrRequest.status !== 200) { if (xhrRequest.readyState === 4 && xhrRequest.status !== 200) {
// Do nothing if failed to fetch autocomplete results // Do nothing if failed to fetch autocomplete results
return; return;
@ -18,6 +18,7 @@ const handleUserInput = searchBar => {
const autocomplete = (searchInput, autocompleteResults) => { const autocomplete = (searchInput, autocompleteResults) => {
let currentFocus; let currentFocus;
let originalSearch;
searchInput.addEventListener("input", function () { searchInput.addEventListener("input", function () {
let autocompleteList, autocompleteItem, i, val = this.value; let autocompleteList, autocompleteItem, i, val = this.value;
@ -53,9 +54,11 @@ const autocomplete = (searchInput, autocompleteResults) => {
let suggestion = document.getElementById(this.id + "-autocomplete-list"); let suggestion = document.getElementById(this.id + "-autocomplete-list");
if (suggestion) suggestion = suggestion.getElementsByTagName("div"); if (suggestion) suggestion = suggestion.getElementsByTagName("div");
if (e.keyCode === 40) { // down if (e.keyCode === 40) { // down
e.preventDefault();
currentFocus++; currentFocus++;
addActive(suggestion); addActive(suggestion);
} else if (e.keyCode === 38) { //up } else if (e.keyCode === 38) { //up
e.preventDefault();
currentFocus--; currentFocus--;
addActive(suggestion); addActive(suggestion);
} else if (e.keyCode === 13) { // enter } else if (e.keyCode === 13) { // enter
@ -63,17 +66,36 @@ const autocomplete = (searchInput, autocompleteResults) => {
if (currentFocus > -1) { if (currentFocus > -1) {
if (suggestion) suggestion[currentFocus].click(); if (suggestion) suggestion[currentFocus].click();
} }
} else {
originalSearch = document.getElementById("search-bar").value;
} }
}); });
const addActive = suggestion => { const addActive = suggestion => {
if (!suggestion || !suggestion[currentFocus]) return false; let searchBar = document.getElementById("search-bar");
// Handle navigation outside of suggestion list
if (!suggestion || !suggestion[currentFocus]) {
if (currentFocus >= suggestion.length) {
// Move selection back to the beginning
currentFocus = 0;
} else if (currentFocus < 0) {
// Retrieve original search and remove active suggestion selection
currentFocus = -1;
searchBar.value = originalSearch;
removeActive(suggestion); removeActive(suggestion);
return;
} else {
return;
}
}
if (currentFocus >= suggestion.length) currentFocus = 0; removeActive(suggestion);
if (currentFocus < 0) currentFocus = (suggestion.length - 1);
suggestion[currentFocus].classList.add("autocomplete-active"); suggestion[currentFocus].classList.add("autocomplete-active");
// Autofill search bar with suggestion content
searchBar.value = suggestion[currentFocus].textContent;
searchBar.focus();
}; };
const removeActive = suggestion => { const removeActive = suggestion => {

@ -1,3 +1,14 @@
// Whoogle configurations that use boolean values and checkboxes
CONFIG_BOOLS = [
"nojs", "dark", "safe", "alts", "new_tab", "get_only"
];
// Whoogle configurations that use string values and input fields
CONFIG_STRS = [
"near", "url"
];
const setupSearchLayout = () => { const setupSearchLayout = () => {
// Setup search field // Setup search field
const searchBar = document.getElementById("search-bar"); const searchBar = document.getElementById("search-bar");
@ -18,15 +29,6 @@ const setupSearchLayout = () => {
}; };
const fillConfigValues = () => { const fillConfigValues = () => {
// Establish all config value elements
const near = document.getElementById("config-near");
const noJS = document.getElementById("config-nojs");
const dark = document.getElementById("config-dark");
const safe = document.getElementById("config-safe");
const url = document.getElementById("config-url");
const newTab = document.getElementById("config-new-tab");
const getOnly = document.getElementById("config-get-only");
// Request existing config info // Request existing config info
let xhrGET = new XMLHttpRequest(); let xhrGET = new XMLHttpRequest();
xhrGET.open("GET", "/config"); xhrGET.open("GET", "/config");
@ -39,15 +41,15 @@ const fillConfigValues = () => {
// Allow for updating/saving config values // Allow for updating/saving config values
let configSettings = JSON.parse(xhrGET.responseText); let configSettings = JSON.parse(xhrGET.responseText);
near.value = configSettings["near"] ? configSettings["near"] : ""; CONFIG_STRS.forEach(function(item) {
noJS.checked = !!configSettings["nojs"]; let configElement = document.getElementById("config-" + item.replace("_", "-"));
dark.checked = !!configSettings["dark"]; configElement.value = configSettings[item] ? configSettings[item] : "";
safe.checked = !!configSettings["safe"]; });
getOnly.checked = !!configSettings["get_only"];
newTab.checked = !!configSettings["new_tab"];
// Addresses the issue of incorrect URL being used behind reverse proxy CONFIG_BOOLS.forEach(function(item) {
url.value = configSettings["url"] ? configSettings["url"] : ""; let configElement = document.getElementById("config-" + item.replace("_", "-"));
configElement.checked = !!configSettings[item];
});
}; };
xhrGET.send(); xhrGET.send();
@ -113,4 +115,8 @@ document.addEventListener("DOMContentLoaded", function() {
setupSearchLayout(); setupSearchLayout();
setupConfigLayout(); setupConfigLayout();
// Focusing on the search input field requires a delay for elements to finish
// loading (seemingly only on FF)
setTimeout(function() { document.getElementById("search-bar").focus(); }, 250);
}); });

@ -8,6 +8,9 @@
<script type="text/javascript" src="/static/js/autocomplete.js"></script> <script type="text/javascript" src="/static/js/autocomplete.js"></script>
<link rel="stylesheet" href="/static/css/{{ 'search-dark' if dark_mode else 'search' }}.css"> <link rel="stylesheet" href="/static/css/{{ 'search-dark' if dark_mode else 'search' }}.css">
<link rel="stylesheet" href="/static/css/header.css"> <link rel="stylesheet" href="/static/css/header.css">
{% if dark_mode %}
<link rel="stylesheet" href="/static/css/dark-theme.css"/>
{% endif %}
<title>{{ query }} - Whoogle Search</title> <title>{{ query }} - Whoogle Search</title>
</head> </head>
<body> <body>

@ -23,6 +23,9 @@
<meta name="viewport" content="width=device-width, initial-scale=1.0"> <meta name="viewport" content="width=device-width, initial-scale=1.0">
<link rel="stylesheet" href="/static/css/{{ 'search-dark' if config.dark else 'search' }}.css"> <link rel="stylesheet" href="/static/css/{{ 'search-dark' if config.dark else 'search' }}.css">
<link rel="stylesheet" href="/static/css/main.css"> <link rel="stylesheet" href="/static/css/main.css">
{% if config.dark %}
<link rel="stylesheet" href="/static/css/dark-theme.css"/>
{% endif %}
<title>Whoogle Search</title> <title>Whoogle Search</title>
</head> </head>
<body id="main" style="display: none; background-color: {{ '#000' if config.dark else '#fff' }}"> <body id="main" style="display: none; background-color: {{ '#000' if config.dark else '#fff' }}">
@ -31,7 +34,7 @@
<form id="search-form" action="/search" method="{{ 'get' if config.get_only else 'post' }}"> <form id="search-form" action="/search" method="{{ 'get' if config.get_only else 'post' }}">
<div class="search-fields"> <div class="search-fields">
<div class="autocomplete"> <div class="autocomplete">
<input type="text" name="q" id="search-bar" autofocus="autofocus"> <input type="text" name="q" id="search-bar" autofocus="autofocus" autocomplete="off">
</div> </div>
<input type="submit" id="search-submit" value="Search"> <input type="submit" id="search-submit" value="Search">
</div> </div>
@ -42,7 +45,7 @@
<div class="config-fields"> <div class="config-fields">
<form id="config-form" action="/config" method="post"> <form id="config-form" action="/config" method="post">
<div class="config-div"> <div class="config-div">
<label for="config-ctry">Country: </label> <label for="config-ctry">Filter Results by Country: </label>
<select name="ctry" id="config-ctry"> <select name="ctry" id="config-ctry">
{% for ctry in countries %} {% for ctry in countries %}
<option value="{{ ctry.value }}" <option value="{{ ctry.value }}"
@ -53,6 +56,7 @@
</option> </option>
{% endfor %} {% endfor %}
</select> </select>
<div><span class="info-text"> — Note: If enabled, a website will only appear in the results if it is *hosted* in the selected country.</span></div>
</div> </div>
<div class="config-div"> <div class="config-div">
<label for="config-lang-interface">Interface Language: </label> <label for="config-lang-interface">Interface Language: </label>
@ -96,6 +100,12 @@
<label for="config-safe">Safe Search: </label> <label for="config-safe">Safe Search: </label>
<input type="checkbox" name="safe" id="config-safe"> <input type="checkbox" name="safe" id="config-safe">
</div> </div>
<div class="config-div">
<label class="tooltip" for="config-alts">Replace Social Media Links: </label>
<input type="checkbox" name="alts" id="config-alts">
<div><span class="info-text"> — Replaces Twitter/YouTube/Instagram links
with Nitter/Invidious/Bibliogram links.</span></div>
</div>
<div class="config-div"> <div class="config-div">
<label for="config-new-tab">Open Links in New Tab: </label> <label for="config-new-tab">Open Links in New Tab: </label>
<input type="checkbox" name="new_tab" id="config-new-tab"> <input type="checkbox" name="new_tab" id="config-new-tab">

File diff suppressed because one or more lines are too long

79
app/utils/filter_utils.py Normal file

@ -0,0 +1,79 @@
from bs4 import BeautifulSoup
import urllib.parse as urlparse
from urllib.parse import parse_qs
SKIP_ARGS = ['ref_src', 'utm']
FULL_RES_IMG = '<br/><a href="{}">Full Image</a>'
GOOG_IMG = '/images/branding/searchlogo/1x/googlelogo'
LOGO_URL = GOOG_IMG + '_desk'
BLANK_B64 = '''

'''
BLACKLIST = [
'ad', 'anuncio', 'annuncio', 'annonce', 'Anzeige', '广告', '廣告', 'Reklama', 'Реклама', 'Anunț', '광고',
'annons', 'Annonse', 'Iklan', '広告', 'Augl.', 'Mainos', 'Advertentie', 'إعلان', 'Գովազդ', 'विज्ञापन', 'Reklam',
'آگهی', 'Reklāma', 'Reklaam', 'Διαφήμιση', 'מודעה', 'Hirdetés'
]
SITE_ALTS = {
'twitter.com': 'nitter.net',
'youtube.com': 'invidiou.site',
'instagram.com': 'bibliogram.art/u'
}
def has_ad_content(element: str):
return element.upper() in (value.upper() for value in BLACKLIST) or 'ⓘ' in element
def get_first_link(soup):
# Replace hrefs with only the intended destination (no "utm" type tags)
for a in soup.find_all('a', href=True):
# Return the first search result URL
if 'url?q=' in a['href']:
return filter_link_args(a['href'])
def get_site_alt(link: str):
for site_key in SITE_ALTS.keys():
if site_key not in link:
continue
link = link.replace(site_key, SITE_ALTS[site_key])
break
return link
def filter_link_args(query_link):
parsed_link = urlparse.urlparse(query_link)
link_args = parse_qs(parsed_link.query)
safe_args = {}
if len(link_args) == 0 and len(parsed_link) > 0:
return query_link
for arg in link_args.keys():
if arg in SKIP_ARGS:
continue
safe_args[arg] = link_args[arg]
# Remove original link query and replace with filtered args
query_link = query_link.replace(parsed_link.query, '')
if len(safe_args) > 0:
query_link = query_link + urlparse.urlencode(safe_args, doseq=True)
else:
query_link = query_link.replace('?', '')
return query_link
def gen_nojs(sibling):
nojs_link = BeautifulSoup().new_tag('a')
nojs_link['href'] = '/window?location=' + sibling['href']
nojs_link['style'] = 'display:block;width:100%;'
nojs_link.string = 'NoJS Link: ' + nojs_link['href']
sibling.append(BeautifulSoup('<br><hr><br>', 'html.parser'))
sibling.append(nojs_link)
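
Note on the new `filter_utils` module above: the two link helpers can be exercised on their own. The sketch below copies `SKIP_ARGS`, `SITE_ALTS`, `get_site_alt`, and `filter_link_args` from the diff so it runs standalone; the sample URLs are invented for illustration. One observation, offered as a reading of the code rather than a statement of intent: `SKIP_ARGS` is matched against whole parameter names, so `utm` is dropped while a parameter such as `utm_source` passes through unchanged.

```python
import urllib.parse as urlparse
from urllib.parse import parse_qs

SKIP_ARGS = ['ref_src', 'utm']

SITE_ALTS = {
    'twitter.com': 'nitter.net',
    'youtube.com': 'invidiou.site',
    'instagram.com': 'bibliogram.art/u'
}


def get_site_alt(link: str):
    # Swap the first matching domain for its alternative front end
    for site_key in SITE_ALTS.keys():
        if site_key not in link:
            continue
        link = link.replace(site_key, SITE_ALTS[site_key])
        break
    return link


def filter_link_args(query_link):
    parsed_link = urlparse.urlparse(query_link)
    link_args = parse_qs(parsed_link.query)
    safe_args = {}

    if len(link_args) == 0 and len(parsed_link) > 0:
        return query_link

    for arg in link_args.keys():
        if arg in SKIP_ARGS:
            continue
        safe_args[arg] = link_args[arg]

    # Remove original link query and replace with filtered args
    query_link = query_link.replace(parsed_link.query, '')
    if len(safe_args) > 0:
        query_link = query_link + urlparse.urlencode(safe_args, doseq=True)
    else:
        query_link = query_link.replace('?', '')
    return query_link


if __name__ == '__main__':
    # Sample URLs below are made up for this sketch
    print(get_site_alt('https://twitter.com/example'))
    print(get_site_alt('https://instagram.com/example'))
    print(filter_link_args('https://example.com/page?utm=abc&id=5'))
    print(filter_link_args('https://example.com/page?utm=abc'))
```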

@ -1,5 +1,5 @@
from app.filter import Filter, get_first_link from app.filter import Filter, get_first_link
from app.utils.misc import generate_user_keys from app.utils.session_utils import generate_user_keys
from app.request import gen_query from app.request import gen_query
from bs4 import BeautifulSoup from bs4 import BeautifulSoup
from cryptography.fernet import Fernet, InvalidToken from cryptography.fernet import Fernet, InvalidToken

@ -2,11 +2,6 @@ from cryptography.fernet import Fernet
from flask import current_app as app from flask import current_app as app
REQUIRED_SESSION_VALUES = ['uuid', 'config', 'fernet_keys'] REQUIRED_SESSION_VALUES = ['uuid', 'config', 'fernet_keys']
BLACKLIST = [
'ad', 'anuncio', 'annuncio', 'annonce', 'Anzeige', '广告', '廣告', 'Reklama', 'Реклама', 'Anunț', '광고',
'annons', 'Annonse', 'Iklan', '広告', 'Augl.', 'Mainos', 'Advertentie', 'إعلان', 'Գովազդ', 'विज्ञापन', 'Reklam',
'آگهی', 'Reklāma', 'Reklaam', 'Διαφήμιση', 'מודעה', 'Hirdetés'
]
def generate_user_keys(cookies_disabled=False) -> dict: def generate_user_keys(cookies_disabled=False) -> dict:

@ -8,7 +8,7 @@ setuptools.setup(
author='Ben Busby', author='Ben Busby',
author_email='benbusby@protonmail.com', author_email='benbusby@protonmail.com',
name='whoogle-search', name='whoogle-search',
version='0.2.0', version='0.2.1',
include_package_data=True, include_package_data=True,
install_requires=requirements, install_requires=requirements,
description='Self-hosted, ad-free, privacy-respecting Google metasearch engine', description='Self-hosted, ad-free, privacy-respecting Google metasearch engine',

@ -1,5 +1,5 @@
from app import app from app import app
from app.utils.misc import generate_user_keys from app.utils.session_utils import generate_user_keys
import pytest import pytest

@ -1,4 +1,4 @@
from app.utils.misc import generate_user_keys, valid_user_session from app.utils.session_utils import generate_user_keys, valid_user_session
def test_generate_user_keys(): def test_generate_user_keys():

@ -1,6 +1,6 @@
from bs4 import BeautifulSoup from bs4 import BeautifulSoup
from app.filter import Filter from app.filter import Filter
from app.utils.misc import generate_user_keys from app.utils.session_utils import generate_user_keys
from datetime import datetime from datetime import datetime
from dateutil.parser import * from dateutil.parser import *
@ -55,7 +55,7 @@ def test_recent_results(client):
result_divs = get_search_results(rv.data) result_divs = get_search_results(rv.data)
current_date = datetime.now() current_date = datetime.now()
for div in result_divs: for div in [_ for _ in result_divs if _.find('span')]:
date_span = div.find('span').decode_contents() date_span = div.find('span').decode_contents()
if not date_span or len(date_span) > 15 or len(date_span) < 7: if not date_span or len(date_span) > 15 or len(date_span) < 7:
continue continue