297 Commits (809520ec707d9f8d3f70965079d358e9ad8a6420)

Author SHA1 Message Date
Laurent le Beau-Martin 1a3790c7b1
Only open external links in a new tab (#380) 2021-08-24 09:06:41 -06:00
සයුරි | Sayuri 8e91564600
Update translations (#373) 2021-07-22 09:13:09 -06:00
Ben Busby 694642ccb3
Set bg color for "top stories" elements 2021-07-05 00:18:28 -04:00
Ben Busby 38c38a772f
Find valid parent element when collapsing result content
Previously if a result element marked for collapsing didn't have a valid
"parent" element, the collapsing was skipped altogether. This loops
through child elements until a valid parent is found (or if one isn't
found, the element will not be collapsed).
2021-07-04 15:20:19 -04:00
Ben Busby 13202cc6b1
Ensure existence of static build dir 2021-07-02 16:21:38 -04:00
Ben Busby 68fdd55482
Use cache busting for css/js files
On app init, short hashes are generated from file checksums to use for
cache busting. These hashes are added into the full file name and used
to symlink to the actual file contents. These symlinks are loaded in the
jinja templates for each page, and can tell the browser to load a new
file if the hash changes.

This is only in place for css and js files, but can be extended in the
future for other file types if needed.
2021-06-30 19:00:01 -04:00
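The cache-busting scheme described in commit 68fdd55482 above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names, the `name.<hash>.ext` naming pattern, the 8-character hash length, and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib
import os


def short_hash(path: str, length: int = 8) -> str:
    """Return the first `length` hex characters of the file's SHA-256 checksum."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()[:length]


def cache_busted_link(path: str, build_dir: str) -> str:
    """Create (if needed) a symlink named `name.<hash>.ext` inside build_dir.

    The link points at the real file, so a change in file contents yields a
    new link name, which a template can reference to force a fresh download.
    """
    name, ext = os.path.splitext(os.path.basename(path))
    link = os.path.join(build_dir, f"{name}.{short_hash(path)}{ext}")
    if not os.path.lexists(link):
        os.symlink(os.path.abspath(path), link)
    return link
```

Because the hash is derived from the file contents, an unchanged file keeps the same link name between restarts, and browsers can keep using their cached copy.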
Ben Busby c41e0fc239
Allow theme to mirror user system settings
Introduces a new config element and environment variable
(WHOOGLE_CONFIG_THEME) for setting the theme of the app. Rather than
just having either light or dark, this allows a user to have their
instance use their current system light/dark preference to determine the
theme to use.

As a result, the dark mode setting (and WHOOGLE_CONFIG_DARK) have been
deprecated, but will still work as expected until a system theme has
been chosen.
2021-06-28 10:26:51 -04:00
Ben Busby afd01820bb
Collapse long result sections into details/summary elements
Sections such as "People also asked" and "related searches" typically
take up a lot of room on the results page, and don't always have the
most useful information. This checks for result elements with more than
7 child divs, extracts the section title, and wraps all elements in a
"details" element that can be expanded/collapsed by the user.

Note that this functionality existed previously (albeit not implemented
as well), but due to changes in how Google returns searches (switching
from using <h2> elements for section headers to <span> or <div>
elements), the approach to collapsing these sections needed to be
updated.
2021-06-23 18:59:57 -04:00
Ben Busby d894bd347d
Handle error when parsing image result url 2021-06-16 10:40:18 -04:00
Ben Busby b21b4f4f57
Skip parsing user agent if absent from request 2021-06-16 10:37:33 -04:00
Ben Busby bcb1d8ecc9
Add lingva translation support in search (#360)
* Add support for Lingva translations in results

Searches that contain the word "translate" and are normal search queries
(i.e. not news/images/video/etc) now create an iframe to a Lingva url to
translate the user's search using their configured search language.

The Lingva url can be configured using the WHOOGLE_ALT_TL env var, or
will fall back to the official Lingva instance url (lingva.ml).

For more info, visit https://github.com/TheDavidDelta/lingva-translate

* Add basic test for lingva results

* Allow user specified lingva instances through csp frame-src

* Fix pep8 issue
2021-06-15 10:14:42 -04:00
deluxghost 82ccace647
Add zh-CN translation (#355) 2021-06-11 11:33:01 -04:00
Aikatsui a6b4252210
Add Sinhala translation (#353) 2021-06-11 10:22:25 -04:00
Ben Busby 904091f440
Bump version to 0.5.4 2021-06-06 13:45:03 -04:00
Ben Busby 44b0fe519c
Revert changes to default language config
A recent issue brought up a good point about how the latest changes to
setting the default language to English break functionality for bilingual
users. The change was likely not the best solution for users who were
being affected by IP geolocation on their instances -- the right
solution for that would be to configure the interface/search language to
their preference instead.
2021-06-06 13:39:06 -04:00
Ben Busby e7a604d428
Fix handling of http (vs https) proxy creation
The requests library requires both 'http' and 'https' values in any
included proxy dict, and whoogle was previously copying the http proxy
to https for simplicity. The assumption was that if the underlying
request wasn't able to connect via https, it would default to http
(otherwise why have the requirement to specify both?)

This led to connectivity issues for users with http only proxies as of
the latest urllib and requests package versions, which are a lot more
strict with connections over https. With the latest versions, if an
https connection cannot be made, the library returns an error.

As a result, the new proxy dict must look something like this for plain
http proxies:

{'http': 'http://domain.tld:port', 'https': 'http://domain.tld:port'}

where both http and https are identical, but both are still required.
2021-06-04 15:30:21 -04:00
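As a small sketch of the proxy dict shape described in commit e7a604d428 above: the helper name `plain_http_proxies` is hypothetical, and only the resulting dict layout comes from the commit message.

```python
def plain_http_proxies(location: str) -> dict:
    """Build a requests-style proxy dict for an http-only proxy.

    Both the 'http' and 'https' keys are present, and both point at the
    same plain http:// URL.
    """
    url = f"http://{location}"
    return {"http": url, "https": url}
```

A dict of this shape is what gets passed as the `proxies` argument of a `requests` call.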
Ben Busby a64a86efb6
Bump version to 0.5.3 2021-06-04 11:31:03 -04:00
Ricardo 9d024cffce
Add Portuguese translation (#345)
* Add Portuguese translation

* Update translations.json
2021-06-04 11:16:58 -04:00
Ben Busby 614dceeb70
Add fallback interface/search lang + cleanup
Since the interface language defaults to IP geolocation by Google, the
default language is now set to English. Still not sure if this is the
best solution, but at least temporarily should clear up some confusion
for users with instances deployed in countries outside of their own.

Also performed some minor cleanup:
  - Updated name of strip_blocked_sites to clean_query
  - Added clean_query to list of jinja template functions
  - Ensured site block list doesn't contain duplicate filters
2021-06-04 11:09:30 -04:00
bruvv 3892355199
Add Dutch translation (#343) 2021-06-03 09:24:59 -04:00
Myzel394 7103d9eccb
Add German translation (#339)
* Added German language

* Fixed translations.json path

* Fixed German name
2021-06-01 19:57:48 -04:00
Ben Busby cbe32a081e
Hotfix: extract only 'q' element from query string
Occasionally the search results will contain links with arguments such
as 'dq', which was being erroneously used in attempts to extract the 'q'
element from query strings. This enforces that only links with '?q=' or
'&q=' (elements with a standalone 'q' arg) will have the element
extracted.

I also refactored the naming of this element once extracted to be just
'q'. Although this seems counterintuitive, it makes a little more sense
since this element is the one we're extracting. It's a vague url arg
name, but it is what it is.

Bump version to 0.5.2 for hotfix release
2021-05-29 12:22:37 -04:00
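The 'q' versus 'dq' distinction from commit cbe32a081e above can be illustrated with the standard library. Note that this sketch swaps in `urllib.parse.parse_qs` rather than the '?q=' / '&q=' substring check the commit message describes; the function name `extract_q` is hypothetical.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def extract_q(link: str) -> Optional[str]:
    """Return the value of the standalone 'q' argument, or None if absent.

    Parsing the query string into named arguments means a 'dq' argument can
    never be mistaken for 'q'.
    """
    params = parse_qs(urlparse(link).query)
    values = params.get("q")
    return values[0] if values else None
```

With this approach, a link that only carries 'dq' yields `None` instead of a wrongly extracted value.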
Ben Busby 43faaee77f
Hotfix: remove site filter for maps links
The new site filter breaks links to Maps results, so filter.py needed
to be updated to handle these links as a unique case. A new method was
introduced to easily remove any "-site:..." filters from the query,
which is now also used to format queries in the header template rather
than manually removing the blocked site list within the template itself.

Bumps version to 0.5.1 for releasing the bugfix

Fixes #329
2021-05-27 12:01:57 -04:00
Federico Torrielli cf55765933
Add Italian localization (#327) 2021-05-25 09:51:05 -04:00
Ben Busby 4649d96dda
Support basic localization (#325)
* Replace hardcoded strings using translation json file

This introduces a new "translations.json" file under app/static/settings
that is loaded on app init and uses the user config value for interface
language to determine the appropriate strings to use in Whoogle-specific
elements of the UI (primarily only on the home page).

* Verify interface lang can be used for localization

Check the configured interface language against the available
localization dict before attempting to use it; otherwise, fall back to
English.

Also expanded language names in the languages json file.

* Add test for validating translation language keys

Also adds Spanish translation to json (the only non-English language I
can add and reasonably validate on my own).

* Validate all translations against original keyset, update readme

Readme has been updated to include basic contributing guidelines for
both code and translations.
2021-05-24 17:03:02 -04:00
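The fallback and keyset validation described in commit 4649d96dda above might look something like the sketch below. The function names, the `lang_en` / `lang_es` language keys, and the sample string keys are assumptions for illustration; the real layout of translations.json is not shown in this log.

```python
def localized_strings(translations: dict, interface_lang: str,
                      default: str = "lang_en") -> dict:
    """Return the strings for interface_lang, falling back to the default."""
    if interface_lang in translations:
        return translations[interface_lang]
    return translations[default]


def missing_keys(translations: dict, reference: str = "lang_en") -> dict:
    """Map each language to the reference keys it lacks (empty if complete)."""
    expected = set(translations[reference])
    report = {}
    for lang, strings in translations.items():
        missing = expected - set(strings)
        if missing:
            report[lang] = missing
    return report
```

A test along the lines of the one mentioned in the commit could simply assert that `missing_keys(...)` returns an empty dict.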
Joao A. Candido Ramos 448efb8f2a
Add "view image" functionality (#268)
* add view image option

* prevent whoogle links from opening in a new tab.

* remove view image template on mobile requests

* change loop values to be more robust to the number of images

* Update app/templates/imageresults.html

* fix "Basically the .cvifge class needs width: 100%; in order to expand the search input to fit the form width."

* Update app/templates/imageresults.html

* remove hardcoded string from template

* Add view image config var to app.json

* Add view image config var to whoogle.env

Co-authored-by: jacr13 <ramos.joao@protonmail.com>
Co-authored-by: Ben Busby <benbusby@protonmail.com>
2021-05-21 11:19:45 -04:00
Ben Busby fcfa3783e3
Bump version to 0.5.0 2021-05-21 10:50:07 -04:00
Ben Busby d5eebe9fe5
Add iframe-able search page for insertion into other sites
Introduces a new html template, search.html, which provides a very basic
form for submitting search queries.

Closes #319
2021-05-21 10:35:46 -04:00
Ben Busby 1fdf226802
Use curl-based healthcheck w/ new non-auth route
The wget method seemed to have a possible issue with creating endless
index.html copies (despite being specified to output to console only),
so this has been updated to use curl instead.

Also uses new non-authenticated "healthz" route to perform the
healthcheck.

Fix #316

Fix #313
2021-05-18 11:48:15 -04:00
bruvv 27b6d05b6a
Fix EU consent bug (#320)
* Update request.py

* Use current date to format EU consent cookie

Co-authored-by: Ben Busby <benbusby@protonmail.com>
2021-05-18 10:52:24 -04:00
Harsh Barsaiyan 4466bbc8f4
Add divider to user-defined CSS (#310) 2021-05-11 12:26:37 -04:00
Ben Busby 05995649f3
Hotfix: check for site filters before modifying query
The previous method of removing all site filters from the search query
removed the last letter of the search. This only applies the substring
filter if any site filters are present in the query.

Fixes #306
2021-05-10 12:07:55 -04:00
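The commit message for 05995649f3 above does not show the faulty code, but one plausible way such a bug arises in Python is slicing with the result of `str.find`, which returns -1 when nothing matches. The sketch below assumes that cause; the function name and the `' -site:'` delimiter are hypothetical.

```python
def strip_site_filters(query: str) -> str:
    """Drop everything from the first ' -site:' filter onward, if present."""
    idx = query.find(" -site:")
    # str.find returns -1 when there is no match; slicing with query[:-1]
    # would silently drop the last character of the search.
    if idx == -1:
        return query
    return query[:idx]
```

Guarding on the "no filters present" case is what keeps a plain query intact.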
Ben Busby c8da53d4b0
Block websites from search results via user config (#304)
* Block websites in search results via user config

Adds a new config field "Block" to specify a comma separated list of
websites to block in search results. This is applied for all searches.

* Add test for blocking sites from search results

* Document WHOOGLE_CONFIG_BLOCK usage

* Strip '-site:' filters from query in header template

The 'behind the scenes' site filter applied for blocked sites was
appearing in the query field when navigating between search categories
(all -> images -> news, etc). This prevents the filter from appearing in
all except "images", since the image category uses a separate header.
This should eventually be addressed when the image page can begin using
the standard whoogle header, but until then, the filter will still
appear for image searches.
2021-05-07 11:45:53 -04:00
Ben Busby a7bf9728e3
Allow 'data:' for img src in app CSP
Disallowing base64 images in the app resulted in broken image
placeholders for things like pronunciation guides, business reviews,
etc.
2021-05-05 12:51:11 -04:00
Angel Mario d6d7110e22
Add option to disable changing config from client (#295)
* Add option to disable changing of configuration

Introduces a test to ensure the correct response code is found when
attempting to update the config when disabled, and ensure default config
is unchanged when posting a new config dict.

Attempting to update the config using the API when disabled now returns
a 403 code + redirect.

Co-authored-by: Ben Busby <benbusby@protonmail.com>
2021-04-27 10:36:03 -04:00
Ben Busby 8ae7b5947e
Separate interface language from search language in env vars
The search language is now set using the WHOOGLE_CONFIG_SEARCH_LANGUAGE
environment variable. Interface language is still set using
WHOOGLE_CONFIG_LANGUAGE.

Fixes #260
2021-04-26 11:38:55 -04:00
Ben Busby f56e913521
Remove gap between input and result types
Enforces 0 margin for the search input form on the result page, which
removes the weird gap that is seen by default.

Also made minor changes to the border styling. Desktop searches now have
a single bottom border in dark mode rather than an all around border,
and the border around the mobile search result input was removed
entirely.
2021-04-22 16:24:43 -04:00
Ben Busby 5b963b441c
Focus search input after clearing w/ reset btn
See #291
2021-04-22 10:02:15 -04:00
Ben Busby 01fe0c02a5
Add button to clear search input on mobile
This was unfortunately a bit more complex than just adding an HTML reset
button, since reset buttons only "reset" input content to its original
value rather than clearing it. This doesn't work for Whoogle's needs,
since inputs on search result pages are auto populated with the search
content as their default value.

A reset button was introduced anyways, but is controlled by a few lines
of javascript to allow completely clearing the search input. The button
will only appear on mobile searches.

At the moment, it isn't particularly pretty, but is functional. It uses
just a plain "x" character and is always visible on mobile search result
pages. This leaves plenty of room for improvement moving forward.

Fixes #291
2021-04-21 11:38:19 -04:00
Ben Busby 7136197e5d
Fix missing text style for active search suggestions 2021-04-21 10:49:27 -04:00
Ben Busby 2eb33007f7
Disable autocorrect on mobile search inputs
Fixes #292
2021-04-21 10:48:26 -04:00
Ben Busby d2fac809ca
Fix mishandling of empty config environment variables
The recent change to cast bool config vars as ints to handle a '0' or
'1' value was shortsighted, since it doesn't allow for instances where
the variable is set to an empty value (or '' or any invalid/non-int
value).

This introduces a read_config_bool method for reading values that should
be a '0' or '1', but will default to False if not a digit (otherwise the
value will be cast as bool(int(value)) if "value" is a digit str).

Fixes #288
2021-04-14 10:42:41 -04:00
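A minimal sketch of the behavior described in commit d2fac809ca above. Only the name `read_config_bool` and the described semantics come from the commit message; the body is an assumption. It uses `str.isdecimal` rather than `str.isdigit` so that characters such as superscript digits, which `int()` cannot parse, are rejected.

```python
import os


def read_config_bool(var: str) -> bool:
    """Read an env var expected to hold '0' or '1'.

    Unset, empty, or non-numeric values are treated as False.
    """
    val = os.getenv(var, "0")
    if val.isdecimal():
        return bool(int(val))
    return False
```

This also avoids the earlier pitfall noted in commit baa7a87efb, where the string '0' was treated as truthy.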
Ben Busby baa7a87efb
Fix incorrect config bool env var casting
Config boolean environment variables need to be cast to ints, since
they are set or unset using 0 and 1. Previously they were interpreted as
(pseudocode) read_var(name, default=False), which meant that setting
CONFIG_VAR=0 would enable that variable since Python reads environment
variables as strings, and '0' is truthy. This updates the previous logic
to (still pseudocode) int(read_var(name, default='0')).

Fixes #279
2021-04-12 16:40:59 -04:00
Ben Busby b7e48a9597
Replace remaining hardcoded theme values
Both light and dark themes have been updated to remove the leftover
hardcoded values (mostly related to the search suggestion styling).

See discussion in #247.
2021-04-12 10:22:34 -04:00
Ben Busby 1030118d0b
Expand custom css theming support
Also adds new default dark theme designed by @gripped.
2021-04-09 11:00:02 -04:00
gripped 13abb0ae7f
Add .BVG0Nb to dark-theme.css 2021-04-09 10:57:23 -04:00
Ben Busby ed32fb927c
Disable logging from imported modules
The logging from imported modules (stem, in particular) has caused quite
a few users to assume there are errors where there aren't any. The logs
from stem also aren't helpful, as everything in the library works as
expected despite the implication from the logs that it is not working.
2021-04-09 09:26:16 -04:00
Ben Busby a321d55f13
Hotfix: Send generic "Mozilla" in user agent
Randomizing the "Mozilla" portion of the user agent changed the
character encoding to GB2312. Setting it to plain "Mozilla" enforces
UTF-8 encoding.

Bump to version 0.4.1 for release of bug fix

Fixes #267
2021-04-08 09:43:41 -04:00
Ben Busby 30be540b97 Bump version to 0.4.0 2021-04-05 11:00:56 -04:00
Ben Busby 0b9600b564 Expand custom css variables and functionality
Squashed commit of the following:

commit 37e22d2945b077a94d9997d064f4355ff8819bae
Author: Ben Busby <benbusby@protonmail.com>
Date:   Mon Apr 5 10:27:05 2021 -0400

    Pass user config to logo template

commit 2406fee05c3e221112fbe802fbf2ecca1df99127
Author: Ben Busby <benbusby@protonmail.com>
Date:   Mon Apr 5 10:24:54 2021 -0400

    Fix incorrect contrast text in dark theme

commit 91dd677e22c2e99819123154e03e9f519f95a9bd
Author: Ben Busby <benbusby@protonmail.com>
Date:   Fri Apr 2 17:21:38 2021 -0400

    Remove inline onclicks, fix svg sizing

commit 91bbf9c0fae36febd6a6a0d8e6a560babe8622d5
Merge: 72637df b1227bd
Author: Ben Busby <benbusby@protonmail.com>
Date:   Fri Apr 2 15:35:37 2021 -0400

    Merge remote-tracking branch 'origin/develop' into custom-css-tweaks

commit 72637df213f4b9e83e4b58fe76973de02f63ec8e
Author: Ben Busby <benbusby@protonmail.com>
Date:   Fri Apr 2 11:38:38 2021 -0400

    Use svg logo w/ custom styling on results pages

commit 666a7ceac4a6e4d3fe1975dcee91e6094b66149e
Author: Ben Busby <benbusby@protonmail.com>
Date:   Fri Apr 2 11:10:37 2021 -0400

    Split whoogle-accent into whoogle-element-bg and whoogle-logo

    See discussion on #247
2021-04-05 11:00:56 -04:00
Ben Busby 50c888f9a7 Revert heroku app https upgrade fix 2021-04-05 11:00:56 -04:00
Ben Busby df0b7afa50 Switch to single Fernet key per session
This moves away from the previous (messy) approach of using two separate
keys for decrypting text and element URLs separately and regenerating
them for new searches. The current implementation of sessions is not very
reliable, which led to keys being regenerated too soon, which would
break page navigation. Until that can be addressed, the single
key per session approach should work a lot better.

Fixes #250

Fixes #90
2021-04-05 11:00:56 -04:00
Ben Busby ed4432f3f8 Hotfix: Upgrade heroku apps to https for all endpoints
The previous implementation of the is_heroku check in
search.needs_https() was implemented to only match URLs ending in
'.herokuapp.com', and skipped upgrading to HTTPS for other endpoints.
2021-04-05 11:00:56 -04:00
Ben Busby 7b9ee37beb Allow defining initial config state w/ env vars
This introduces a set of environment variables that can be used for
defining initial config state, to expedite the process of
destroying/relaunching instances quickly with the same settings every
time.

Closes #228

Closes #195
2021-04-05 11:00:56 -04:00
Shimul 8a10efaa01 Allow setting environment variables in whoogle.env (#237)
This allows the user to enable their preferred settings in a variety of
ways, depending on their deployment preference. Values added to
whoogle.env can be enabled using WHOOGLE_DOTENV=1, in which case all
values in the env var file will overwrite defaults or user provided
settings.

Co-authored-by: Ben Busby <benbusby@protonmail.com>
2021-04-05 11:00:56 -04:00
Ben Busby 8ad8e66d37 Improve static typing throughout repo
Eventually this should be part of a separate mypy ci build, but right
now it's just a general guideline. Future commits and PRs should be
validated for static typing wherever possible.

For reference, the testing commands used for this commit were:

mypy --ignore-missing-imports --pretty --disallow-untyped-calls app/
mypy --ignore-missing-imports --pretty --disallow-untyped-calls test/
2021-04-05 11:00:56 -04:00
Shimul 892b646a4e Configure PWA for mobile browsers (#234)
Fix PWA issue for mobile phones
Fix icon loading issue
Update app/static/img/favicon/manifest.json

Co-authored-by: Ben Busby <benbusby@pm.me>
2021-04-05 11:00:56 -04:00
Ben Busby e7c63afc1a Re-add search css to results page
The results page search css was removed during the refactor to allow for
user defined css. This adds that back.
2021-04-05 11:00:56 -04:00
Ben Busby 083c3758a1 Return 503 if response is blocked by captcha
Also added in a slight modification to the dark theme style, which
should only apply the border radius in the header.

Closes #226
2021-04-05 11:00:56 -04:00
Ben Busby 62a9b9e949 Allow user-defined CSS/theming (#227)
* Add custom CSS field to config

This allows users to set/customize an instance's theme and appearance to
their liking. The config CSS field is prepopulated with all default CSS
variable values to allow quick editing.

Note that this can be somewhat of a "footgun" if someone updates the
CSS to hide all fields/search/etc. Should probably add some sort of
bandaid "admin" feature for public instances to employ until the whole
cookie/session issue is investigated further.

* Symlink all app static files to test dir

* Refactor app/misc/*.json -> app/static/settings/*.json

The country/language json files are used for user config settings, so
the "misc" name didn't really make sense. Also moved these to the static
folder to make testing easier.

* Fix light theme variables in dark theme css

* Minor style tweaking
2021-04-05 11:00:56 -04:00
Shimul 337d0ebe37 Handle manifest-src in CSP (#231) 2021-04-05 11:00:56 -04:00
Ben Busby e5d1f6a292 Add healthcheck to Dockerfile
See #184
2021-04-05 11:00:56 -04:00
Ben Busby f8dfc78539 Improve naming of *_utils files, update fn/class doc
The app/utils/*_utils weren't named very well, and all have been updated
to have more accurate names.

Function and class documentation for the utils has been updated as well,
as part of the effort to improve overall documentation for the project.
2021-04-05 11:00:56 -04:00
Ben Busby dcb80ac250 Send CSP header in all responses
Introduces a new content security policy header for responses to all
requests to reduce the possibility of ip leaks to outside connections.
By default blocks all inline scripts, and only allows content loaded
from Whoogle.

Refactors a few small inline scripting cases in the project to their own
individual scripts.
2021-04-05 11:00:56 -04:00
Ben Busby d146016860 Remove auth req for accessing opensearch
Requiring authentication for accessing the opensearch template prevents
the browser from accessing the file when adding as a default search
engine. This removes the authentication requirement from the opensearch
route, which should never provide any sensitive information anyways.
2021-04-05 11:00:56 -04:00
Ben Busby ecb7885a56 Allow bang operator anywhere in query
The bang operator can now be placed anywhere in the query, to allow for peak
efficiency in stream of consciousness querying (e.g. `big !reddit
chungus` will search reddit for `big chungus`).

Fixes #196
2021-04-05 11:00:56 -04:00
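The position-independent bang handling from commit ecb7885a56 above could be sketched as follows. The function name `split_bang` and the whitespace-based tokenization are assumptions for illustration, not the project's implementation.

```python
from typing import Optional, Tuple


def split_bang(query: str) -> Tuple[Optional[str], str]:
    """Return (bang, remaining query) with the first '!word' token removed.

    If the query contains no bang, the first element is None and the query
    is returned unchanged.
    """
    words = query.split()
    for i, word in enumerate(words):
        if word.startswith("!") and len(word) > 1:
            return word, " ".join(words[:i] + words[i + 1:])
    return None, query
```

For the example in the commit message, `split_bang("big !reddit chungus")` separates the `!reddit` bang from the search terms `big chungus`.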
Ben Busby 64567a63ea Ensure G logo doesn't appear in mobile img results
Adds a separate check to remove all images sourced from www.gstatic.com,
which is where the mobile logo in particular is coming from.
2021-04-05 11:00:56 -04:00
Tomasz Borychowski 03bd4b6871 fix 'j' and 'k' inside search input 2021-04-05 11:00:56 -04:00
Roman Štefko 7f3a284e04 Do not autocapitalize on index page search bar (#200) 2021-04-05 11:00:56 -04:00
Tomasz Borychowski 5538ac862e add basic keyboard support 2021-04-05 11:00:56 -04:00
Ben Busby 6600d8580c Add ability to redirect reddit.com to libredd.it (#180)
* Adds the ability to redirect reddit.com to libredd.it using the existing
 "site alts" config setting.

This adds the WHOOGLE_ALT_RD environment variable for optionally
redirecting reddit links to libreddit
(https://github.com/spikecodes/libreddit).

* Include libreddit in home page site alt note
2021-04-05 11:00:56 -04:00
Ben Busby b57c86a1d0
Bump version to 0.3.2 2021-04-02 12:57:15 -04:00
Ben Busby fdd4ee590f
Hotfix: Set EU consent cookie to pending for all requests
See discussion on #243
2021-04-02 12:32:59 -04:00
Ben Busby 0a6575d219
Hotfix: Move language/country json to app dir
Pip installs of whoogle search were missing access to the misc/ folder,
which previously contained the language and country json files. These
have been moved to app/misc, and the previous root level misc/ was
renamed to config/ (since it now only contains the tor config files).

Bump to 0.3.1.
2021-02-07 18:55:27 -05:00
Ben Busby 329c38efb0
Hotfix: Enforce https in heroku opensearch template
Heroku instances were using the base http url when formatting the
opensearch.xml template. This adds a new routing utility, "needs_https",
which can be used for determining if the url in question needs
upgrading.
2021-01-23 14:50:30 -05:00
Ben Busby 5c69283e80
Hotfix: Add hidden submit btn for nojs searches
With javascript disabled, searches could not be submitted on the results
page using the "Enter" key. Adding a hidden submit button to the header
template resolves this issue.
2021-01-19 11:11:13 -05:00
Ben Busby 406e236666
Bump version to 0.3.0 2021-01-17 23:07:43 -05:00
Ben Busby 440c4e9c50
Remove lxml dependency
The lxml dependency in the project was fairly unnecessary, and made the
initial build time for the project considerably slower. This replaces
all instances of lxml with either the default html parser (for bs4
constructors) or the built in xml.etree package (for search suggestion
parsing).
2020-12-29 18:43:42 -05:00
Ben Busby 2bbc649903
Add support for UPS/USPS/FedEx tracking queries
Introduces a new javascript "utils" file, which includes a check for
matching the query against a set of tracking number regexes on page
load. If a match is found, the script prepends a link to the
(presumably) appropriate tracking page.

Referenced in #98
2020-12-27 18:00:35 -05:00
Ben Busby 6e7ec9918a
Move language/country settings to app config
Moves the language and country dicts from the config model to json files
that are loaded during app init and stored in the app config dict. This
substantially improves the readability of the config model and allows
for much more sensible loading of the language/country options.
2020-12-17 16:42:05 -05:00
Ben Busby 375f4ee9fd
PEP-8: Fix formatting issues, add CI workflow (#161)
Enforces PEP-8 formatting for all python code

Adds a github action build for checking pep8 formatting using pycodestyle
2020-12-17 16:06:47 -05:00
Ziga Zajc b55aad3fdf
Use #222 for dark mode bg (#159) 2020-12-17 16:03:05 -05:00
Ben Busby b695179c79
Add ability to collapse "people also ask"
This adds a step in the filter process to wrap the "people also ask"
section in a <details> element, which automatically collapses the
contents of the section. Clicking/tapping the details element expands
the view as normal.

See #113
2020-12-15 11:09:48 -05:00
Ben Busby 3978241d28
Fix black text in dark mode dropdowns
Closes #145
2020-12-15 10:48:29 -05:00
Ben Busby 5b5c2588ed
Fix nojs lxml constructor
The BeautifulSoup constructor in gen_nojs needed to explicitly set
features='lxml' to silence a warning from the library.

Also temporarily disabled the site alts test since the results are too
unreliable. This should be moved to a unit test instead.
2020-12-11 19:21:32 -05:00
Ben Busby e6db3112f7
Fix pagination bug for pages > 3
The pagination footer on the results page after page 2 has three actions
(beginning, next, previous). The footer filter was updated to remove
items with more than three actions to fix this.

See #131
2020-12-07 20:38:57 -05:00
Ben Busby 6c429e6dd1
Allow setting site alts using environment vars (#155)
* Add ability to configure site alts w/ env vars

Site alternatives (i.e. twitter.com -> nitter.net) can now be configured
using environment variables:

WHOOGLE_ALT_TW='nitter.net' # twitter alt
WHOOGLE_ALT_YT='invidio.us' # youtube alt
WHOOGLE_ALT_IG='bibliogram.art/u' # instagram alt

Updated testing to confirm results have been modified.

* Add site alt vars to docker settings and readme
2020-12-05 17:01:21 -05:00
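A rough sketch of how the site alternatives from commit 6c429e6dd1 above might be read and applied. The environment variable names come from the commit message; the function names, the use of the example values as fallbacks, and the simple substring replacement are assumptions.

```python
import os
from typing import Mapping


def load_site_alts(env: Mapping[str, str] = os.environ) -> dict:
    """Map original domains to their configured alternatives."""
    return {
        "twitter.com": env.get("WHOOGLE_ALT_TW", "nitter.net"),
        "youtube.com": env.get("WHOOGLE_ALT_YT", "invidio.us"),
        "instagram.com": env.get("WHOOGLE_ALT_IG", "bibliogram.art/u"),
    }


def apply_site_alt(link: str, alts: Mapping[str, str]) -> str:
    """Rewrite a link to its alternative site, dropping any 'www.' prefix.

    Matching is a plain substring check, which is enough for a sketch but
    would also match look-alike domains.
    """
    for original, alt in alts.items():
        if original in link:
            return link.replace("www." + original, original).replace(original, alt)
    return link
```

Passing the environment as a mapping keeps the lookup easy to test without touching the real process environment.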
Ben Busby 44a5da1895
Fix heroku https upgrade, add funding options
Heroku app instances have been notoriously bad at having the instance
automatically upgraded to https. This adds a step in the before request
decorator to always upgrade heroku apps, since they're always deployed
with the certificate, but never configured to upgrade automatically.

Fixes #153
2020-12-05 15:53:42 -05:00
Ben Busby 54109874fb
Move screenshots/branding to separate docs folder 2020-12-04 10:53:12 -05:00
Ben Busby 2d0823b012
Hotfix: Remove mobile subdomain for invidious redirect
See #151
2020-11-28 21:30:58 -05:00
Ben Busby 0afd59056f
Hotfix: update invidious url, remove www from link
The invidious instance has been updated to invidious.snopyta.org, since
this instance is more reliable and has more users according to
instances.invidio.us

All site alternative redirects now redirect without the 'www' subdomain,
since most of the alternative sites don't have this subdomain set up.
2020-11-28 12:15:04 -05:00
Ben Busby 0d0f32d108
Hotfix: update ad filter for Portuguese config 2020-11-24 13:14:40 -05:00
Ben Busby a519de90af
Enforce GET-only in opensearch for Chrome
The resolution for enabling full support for search + suggestions in
Chrome is to remove the "method" tag altogether for any Chrome based
browser. Any inclusion of this tag seems to break the search suggestion
feature, and makes the user add the search engine manually.
2020-11-18 10:31:19 -05:00
Ben Busby 72cbc342af Add ability to set temp config in search query
Dark mode, country, interface language, and search language configs
can now be set in the search query by appending each option as a
url parameter.

Supported args are: 'dark', 'lang_search', 'lang_interface', and 'ctry'

Ex: /search?q=%s&dark=1&lang_search=lang_en...

These config settings persist across page navigation and switching
result type, but will be reset if the main search bar is used.

See #144
2020-11-11 00:40:49 -05:00
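The per-query overrides described in commit 72cbc342af above can be pulled out of a URL with the standard library. The four argument names come from the commit message; the function name `temp_config` and the returned dict shape are assumptions.

```python
from urllib.parse import parse_qs, urlparse

# Argument names listed in the commit message.
SUPPORTED_ARGS = ("dark", "lang_search", "lang_interface", "ctry")


def temp_config(url: str) -> dict:
    """Collect the supported temporary config args present in the URL."""
    params = parse_qs(urlparse(url).query)
    return {key: params[key][0] for key in SUPPORTED_ARGS if key in params}
```

Arguments outside the supported list, including the search term `q` itself, are ignored by this helper.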
Ben Busby f88d1fbb66 Fix main page visibility for noscript users
The body tag of the home page was previously hidden until the page was
finished loading to prevent a flash of unstyled content, but this broke
functionality for users who disallow javascript. This adds in a new
noscript tag to manually enable visibility of the body element, as well
as automatically displaying the config section (since its visibility is
also typically handled by javascript).
2020-11-03 10:41:29 -05:00
bugbounce 1148a7fb8d
Use relative links instead of absolute (#139)
* Use relative links instead of absolute

This allows for hosting under a subpath. For example if you want to host
whoogle at example.com/whoogle, it should work better with a reverse proxy.

* Use relative link for opensearch.xml
2020-10-29 11:09:31 -04:00
Ben Busby 933ce7e068 Handle FF sending bad search suggestion param
Occasionally, Firefox will send the search suggestion
string to the server without a mimetype, resulting in the suggestion
only appearing in Flask's `request.data` field. This field is typically
not used for parsing arguments, as the documentation states:

Contains the incoming request data as string in case it came with a
mimetype Flask does not handle.

This fix captures the bytes object sent to the server and parses it into
a normal query to be used in forming suggestions.
2020-10-28 23:02:41 -04:00
Ben Busby 0ef098069e
Add tor and http/socks proxy support (#137)
* Add tor and http/socks proxy support

Allows users to enable/disable tor from the config menu, which will
forward all requests through Tor.

Also adds support for setting environment variables for alternative
proxy support. Setting the following variables will forward requests
through the proxy:
    - WHOOGLE_PROXY_USER (optional)
    - WHOOGLE_PROXY_PASS (optional)
    - WHOOGLE_PROXY_TYPE (required)
      - Can be "http", "socks4", or "socks5"
    - WHOOGLE_PROXY_LOC  (required)
      - Format: "<ip address>:<port>"

See #30

* Refactor acquire_tor_conn -> acquire_tor_identity

Also updated travis CI to set up tor

* Add check for Tor socket on init, improve Tor error handling

Initializing the app sends a heartbeat request to Tor to check for
availability, and updates the home page config options accordingly. This
heartbeat is sent on every request, to ensure Tor support can be
reconfigured without restarting the entire app.

If Tor support is enabled, and a subsequent request fails, then a new
TorError exception is raised, and the Tor feature is disabled until a
valid connection is restored.

The max attempts has been updated to 10, since 5 seemed a bit too low
for how quickly the attempts go by.

* Change send_tor_signal arg type, update function doc

send_tor_signal now accepts a stem.Signal arg (a bit cleaner tbh). Also
added the doc string for the "disable" attribute in TorError.

* Fix tor identity logic in Request.send

* Update proxy init, change proxyloc var name

Proxy is now only initialized if both type and location are specified,
as neither has a default fallback and both are required. I suppose the
type could fall back to http, but seems safer this way.

Also refactored proxyurl -> proxyloc for the runtime args in order to
match the Dockerfile args.

* Add tor/proxy support for Docker builds, fix opensearch/init

The Dockerfile is now updated to include support for Tor configuration,
with a working torrc file included in the repo.

An issue with opensearch was fixed as well, which was uncovered during
testing and was simple enough to fix here. Likewise, DDG bang gen was
updated to only ever happen if the file didn't exist previously, as
testing with the file being regenerated every time was tedious.

* Add missing "@" for socks proxy requests
2020-10-28 20:47:42 -04:00
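
As a rough illustration of the proxy variables listed in commit 0ef098069e, the sketch below builds a proxy mapping in the shape the `requests` library accepts for its `proxies` argument. The function name `build_proxies` and the exact URL assembly are assumptions, not the project's actual code; only the variable names, the allowed types, and the rule that both type and location are required come from the commit message.

```python
import os

VALID_TYPES = {"http", "socks4", "socks5"}


def build_proxies(env=os.environ):
    """Return a requests-style proxies dict, or None if not configured.

    Hypothetical helper: the proxy is only set up when both
    WHOOGLE_PROXY_TYPE and WHOOGLE_PROXY_LOC are present.
    """
    proxy_type = env.get("WHOOGLE_PROXY_TYPE", "")
    proxy_loc = env.get("WHOOGLE_PROXY_LOC", "")
    if proxy_type not in VALID_TYPES or not proxy_loc:
        return None

    user = env.get("WHOOGLE_PROXY_USER", "")
    password = env.get("WHOOGLE_PROXY_PASS", "")
    # Credentials are optional; when present they are joined to the
    # location with an "@" separator.
    auth = f"{user}:{password}@" if user and password else ""

    url = f"{proxy_type}://{auth}{proxy_loc}"
    return {"http": url, "https": url}
```

Passing a plain dict instead of `os.environ` makes the behavior easy to check without touching the real environment.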
Ben Busby f3bb1e22b4 Fix improper header styling, remove shopping tab links
The header template was using Google's classes for the "Whoogle" logo,
which meant keeping up with their list of colors used in the logo. The
template was updated to only ever use the Whoogle logo color.
Accordingly, the logo specific styling in filter.py was removed, since
it is no longer needed.

Also removes all links to the shopping tab, as it seems that the
majority of the links to items are Google specific links (usually
google.com/aclk links without any discernible param for determining the
true location for the link). The shopping page should be addressed
separately with unique filtering/formatting. This task is tracked in
#136.
2020-10-25 13:52:30 -04:00
Ben Busby ae05e8ff8b Finished basic implementation of DDG bang feature
Initialization of the app now includes generation of a ddg-bang json
file, which is used for all bang style searches afterwards.

Also added search suggestion handling for bang json lookup. Queries
beginning with "!" now reference the bang json file to pull all keys
that match.

Updated test suite to include basic tests for bang functionality.

Updated gitignore to exclude bang subdir.
2020-10-10 15:55:14 -04:00
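
The bang suggestion lookup described in commit ae05e8ff8b could look roughly like the sketch below. It assumes the bang json maps bang keys to target URLs and that "match" means a prefix match; both are assumptions, and `bang_suggestions` is a hypothetical name rather than the project's function.

```python
def bang_suggestions(query, bangs):
    """Return the bang keys that start with the typed query.

    Hypothetical helper: `bangs` stands in for the contents of the
    generated bang json file (key -> target URL).
    """
    # Only queries that begin with "!" are treated as bang lookups.
    if not query.startswith("!"):
        return []
    # Prefix matching is an assumption; sorted for stable output.
    return sorted(key for key in bangs if key.startswith(query))
```

Non-bang queries return an empty list, leaving regular search suggestions to handle them.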
Ben Busby 2126742b76
Merge branch 'develop' into develop 2020-10-07 18:38:36 -04:00
Ben Busby b01b6d8c69 Minor change to wording of language config 2020-10-04 14:11:44 -04:00
curlpipe 558e3e1514
Fixed annoying browser autocomplete (#128) 2020-10-04 13:53:37 -04:00
Ben Busby dfb1e81fa1 Added search input auto focus, updated README
The javascript controller has been updated to include a call to focus
the cursor on the search field. This previously had only been seen on
Firefox, and was assumed to be a weird FF-specific bug. Adding in a
timeout to allow elements to finish loading allows the field to be
focused as expected.

Also updated the README to include clarification for IP address
tracking.
2020-09-30 10:26:27 -04:00
Ben Busby 9a03b4111d Clarified country filter, updated invidious result URL (closes #123)
Improves clarity of the meaning behind the "Country" filter -- Google
seemingly uses this value to only return results that are hosted in a
particular country, as evidenced in the search differences highlighted
in #123. It now mentions that the results are filtered by website
hosting location.

Also, now that invidio.us is shut down, the fallback URL (invidiou.site)
is now used instead.
2020-09-17 18:59:37 -04:00
Ben Busby 9afe5f81bd
Updated dark theme (#121)
* Implemented new dark theme

Now uses a dedicated css file for all dark theme color changes, rather
than replacing color codes directly.

Color theme is from discussion in #60.

* Minor link color update
2020-09-14 15:29:58 -04:00
Ben Busby e471b012a0 Updated opensearch template
Reconfigured template to only use method parameter if set to search via
POST request (which is the default).

Apparently Chrome/Chromium based browsers don't like non-GET request
searches, and specifying a method caused Chrome to reject the template
altogether.
2020-08-15 14:03:26 -06:00
Ben Busby 0c0a01b83f Minor opensearch route and description updates
Bumped version to 0.2.1 for next release

Updated image in opensearch template to use base64 image

Updated opensearch route to serve file as attachment
2020-08-15 13:02:17 -06:00
Ben Busby b2ecd8dc78 Updated search suggestion behavior (closes #115)
Arrow key navigation through search suggestions now populates the input
field with text content from the active selection. Navigating "down"
past the end of the suggestions list returns the active cursor to position 0,
while navigating "up" before the list of suggestions restores the
original search query and removes the active highlight from element 0.
2020-08-15 11:58:16 -06:00
Ben Busby 975ece8cd0
Privacy respecting alternatives in results view (#106)
Full implementation of social media alt redirects (twitter/youtube/instagram -> nitter/invidious/bibliogram) depending on configuration.

Verbatim search and option to ignore search autocorrect are now supported as well.

Also cleaned up the javascript side of whoogle config so that it now
uses arrays of available fields for parsing config values instead of manually assigning each
one to a variable.

This doesn't include support for Google Maps -> OpenStreetMap, which
seems a bit more involved than the social media redirects were, so it
should likely be a separate effort.
2020-07-26 11:53:59 -06:00
Marvin Borner 5575bcd0af
Merge branch 'develop' into develop 2020-06-28 11:11:53 +02:00
Joao A. Candido Ramos bf4bf1ff2c
Split interface and results language config (#89)
Adds support for choosing the search language and the interface language separately (allowing a default given by Google).

Co-authored-by: Joao <ramos.joao@protonmail.com>
2020-06-27 14:23:17 -06:00
Marvin Borner dd9d87d25b
Added ddg-style !bang-operators
This is a proof of concept! The code works, but uses hardcoded operators
and may be placed in the wrong file/class.
The best-case scenario would be the possibility to use the 13,000+ ddg
operators, but I don't know if that's possible without having to
redirect to duckduckgo first.
2020-06-26 00:26:02 +02:00
Ben Busby ebfa87f561
Fixed dark mode footer text color
Updated to use config accessor rather than boolean value
2020-06-11 21:13:43 -06:00
Ben Busby b2133edaa3
Session refactoring and improved filter (#86)
* Project refactor (#85)

* Major refactor of requests and session management

- Switches from pycurl to requests library
  - Allows for less janky decoding, especially with non-latin character
  sets
- Adds session level management of user configs
  - Allows for each session to set its own config -- users with blocked cookies fall back to the "default" profile (same usage as before)
- Updates key gen/regen to more aggressively swap out keys after each
request

* Added ability to save/load configs by name

- New PUT method for config allows changing config with specified name
- New methods in js controller to handle loading/saving of configs

* Result formatting and removal of unused elements

- Fixed question section formatting from results page (added appropriate
padding and made questions styled as italic)
- Removed user agent display from main config settings

* Minor change to save config button label (now "Save As...")

* Fixed issue with "de-pickling" of flask session

Having a gitignore-everything ("*") file within a flask session folder seems to cause a
weird bug where the state of the app becomes unusable from continuously
trying to prune files listed in the gitignore (and it can't prune '*').

* Switched to pickling saved configs

* Updated ad/sponsored content filter and conf naming

Configs are now named with a .conf extension to allow for easier manual
cleanup/modification of named config files

Sponsored content now removed by basic string matching of span content

* Version bump to 0.2.0

* Fixed request.send return style

* Moved custom conf files to their own directory

* Refactored whoogle session mgmt

Now allows a fallback "default" session to be used if a user's browser
is blocking cookies

* Reworked pytest client fixture to support new session mgmt

* Added better multilingual support, updated filter

Results page now includes method for switching to "All Languages" from
whichever language is specified as the primary in the config (see #74).

Also removes the non-Whoogle links from the page footer, leaving only
the page navigation controls

Added support for the date range filter on the results page, though I'd
still recommend using the ":past <unit>" query instead.

* Removed no-cache enforcement, minor styling/formatting improvements

* Improving ad filtering for non-English languages

* Added footer to results page
2020-06-11 13:38:51 -06:00
Ben Busby 71ba00785f Quick improvement to ad removal 2020-05-29 13:21:53 -06:00
Ben Busby cb18bc6ccc Updated autocomplete styling
Added dark theme specific stylesheet to use if dark mode is active
2020-05-26 10:58:37 -06:00
Ben Busby 78939e7fb4 Reworked google url routing 2020-05-26 10:47:40 -06:00
Ben Busby 98d639883c Fixing styling/url/safe mode inconsistencies 2020-05-26 10:39:19 -06:00
Ben Busby 9212f9921a Fixed #76
Added enter key submit on results page

Added results type carryover for subsequent searches on results page

Removed redundant header on image search results
2020-05-25 10:53:15 -06:00
Ben Busby d1f38cf924 Fixed styling of footer in dark mode 2020-05-25 10:33:24 -06:00
Ben Busby 21012f5265
Feature: autocomplete/search suggestions (#72)
Basic autocomplete/search suggestion functionality added

* Adds new GET and POST routes for '/autocomplete' that accept a string query and return an array of suggestions

* Adds new autoscript.js file for handling queries on the main page and results view

* Updated requests class to include autocomplete method

* Updated opensearch template to handle search suggestions

* Added header template to allow for autocomplete on results view

* Updated readme to mention autocomplete feature
2020-05-24 14:03:11 -06:00
Ben Busby 3dbe51e9e7 Removing google's filter card from results 2020-05-24 12:53:21 -06:00
Ben Busby 09c53b52af
Feature: country and safe search config options (#71)
* Added country and safe search config options

* Updated handling of parser error in results test

* Improved handling of default country

* Added 1px empty gif fallback as a replacement for images that fail to load
2020-05-23 14:27:23 -06:00
Ben Busby 699aa4f2e7 Bumped version to 0.1.4 2020-05-22 16:08:47 -06:00
Ben Busby b131f47641 Bumped version to v0.1.3
(forgot to update pip package version)
2020-05-22 10:45:49 -06:00
Ben Busby f1e17d8119
Bumped version to v0.1.2 2020-05-22 10:38:58 -06:00
Ben Busby c51f186419 Added version footer, minor PEP 8 refactoring 2020-05-20 11:02:30 -06:00
Ben Busby 38b7b19e2a
Added basic authentication (#51)
Username/password can be set either as Dockerfile build arguments or
passed into the run script as "--userpass <username:password>"
2020-05-18 10:30:32 -06:00
Paul Rothrock 0e39b8f97b
Added "I'm feeling lucky" function (#46)
* Putting '! ' at the beginning of the query now redirects to the first search result

Signed-off-by: Paul Rothrock <paul@movetoiceland.com>

* Moved get_first_url outside of filter class

Signed-off-by: Paul Rothrock <paul@movetoiceland.com>
2020-05-18 10:28:23 -06:00
Ben Busby a4382d59f6
Updated redirect code used in https redirects
See https://developer.mozilla.org/en-US/docs/Web/HTTP/Redirections

301 redirections do not reliably keep the request method intact; clients can occasionally change a POST to a GET when following them

308 redirections always keep the request method, which is necessary for all POST search requests
2020-05-16 09:31:07 -06:00
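
To make the 301 vs 308 distinction in commit a4382d59f6 concrete, here is a small standard-library sketch that computes the HTTPS target and the status code for an incoming URL. The helper name `https_redirect` is hypothetical; in a Flask app the resulting pair would typically be passed to `redirect(target, code=308)`.

```python
from http import HTTPStatus
from urllib.parse import urlsplit


def https_redirect(url):
    """Return (https_url, status_code) for a non-HTTPS URL, else None.

    Hypothetical helper: 308 is used so that a POST stays a POST
    after the redirect is followed.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        return None  # already secure, nothing to do
    # Swap only the scheme; host, path, and query are preserved.
    target = parts._replace(scheme="https").geturl()
    return target, HTTPStatus.PERMANENT_REDIRECT.value
```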
Ben Busby b4165f9957 Minor improvement to https enforcement 2020-05-15 16:29:22 -06:00
Ben Busby 3123789584
Added config option for opening links in new tab (#49) 2020-05-15 16:10:31 -06:00
Ben Busby 1ed6178e9a
Feature: https only -- adds option to enforce https on running instances (#48)
* Adding HTTPS enforcement

Command line runs of Whoogle Search through pip/pipx/etc will need the
`--https-only` flag appended to the run command.

Docker runs require the `use_https` build arg applied.

* Update README.md

Moved https-only note to top of docker run command, updated pip runner help output

* Dockerfile: removed HTTPS enforcement, updated PORT setting

Dockerfile no longer enforces an HTTPS connection, but still allows for
setting via a build arg. The Flask server port is now configurable as a
build arg as well, by setting a port number to "whoogle_port"

* Fixed incorrect port assignment
2020-05-15 15:44:50 -06:00
Ben Busby afd5b9aa83 Minor fix to dark mode on img results 2020-05-15 14:17:16 -06:00
Ben Busby 87f0a8d496
Added volume mounted config to Dockerfile (#39) 2020-05-13 18:27:04 -06:00
Ben Busby f4bd3df2bb
Added option to search only via GET request (#36)
This addresses #18, which brought up the issue of searching with Whoogle
with the search instance set to always use a specific container in
Firefox Container Tabs.

Could also be useful if you want to share your search results or
something, I guess. Though nobody likes when people do that.
2020-05-13 00:19:51 -06:00
Ben Busby a11ceb0a57
Feature: language config (#27)
* Added language configuration support

Main page now has a dropdown for selecting preferred language of
results.

Refactored config to be its own model with language constants.

* Added more language support

Interface language is now updated using the "hl" arg

Fixed chinese traditional and simplified values

Updated decoding of characters to gb2312

* Updated to use conditional decoding dependent on language

* Updated filter to not rely on valid config to work properly
2020-05-12 17:15:53 -06:00
Jake Howard f700ed88e7
Swap out Flask's default web server for Waitress (#32)
* Ignore venv when building docker file

* Remove reference to 8888 port

It wasn't really used anywhere, and setting it to 5000 everywhere removes ambiguity, and makes things easier to track and reason about

* Use waitress rather than Flask's built in web server

It's not production grade

* Actually add waitress to requirements

Woops!
2020-05-12 17:14:55 -06:00
Ben Busby 445019d204 Fixed RAM usage bug
Pushing straight to master since this is an extremely simple fix, with
a pretty large performance benefit.

The Phyme library used for generating a User Agent rhyme was consuming
an absolute unit of memory. Now that it's removed, it's using about 10x
less memory, at the cost of User Agents being not as funny anymore.
2020-05-12 00:45:56 -06:00
Ben Busby 1798b6094d
Merge pull request #16 from Kombustor/patch-1
Add autofocus to input field
2020-05-10 14:09:42 -06:00
Ben Busby 7ccad2799e Added config option to address instance behind reverse proxy
Config options now allow setting a "root url", which defaults to the
request url root. Saving a new url in this field will allow for proper
redirects and usage of the opensearch element.

Also provides a possible solution for #17, where the default flask redirect method redirects to
http instead of https.
2020-05-10 13:27:02 -06:00
Fabian Schliski 743caf6cc7
Updating autofocus value 2020-05-10 20:17:32 +02:00
Fabian Schliski 0fc5fa9d99
Add autofocus to input field
Supported in all major browsers, allows the user to immediately start typing after loading the page.
2020-05-10 16:51:42 +02:00
Ben Busby 130ac4532e Refactored handling of user config
Now implemented as a Flask global variable that reads from the same JSON file
as before, but doesn't crash if it does not find an existing file.

Removed user config creation from run script
2020-05-06 18:39:12 -06:00
Ben Busby d316fd77c6 Updated setup and routes for pipx compatibility 2020-05-06 18:13:02 -06:00
Ben Busby d01f56ea03 Removed referrer from links, refactored routes
Added <meta name="referrer" content="no-referrer"> to all whoogle
templates

Refactored search route to conditionally use either request.args or
request.form, depending on the request method (GET vs POST respectively)
2020-05-05 18:28:43 -06:00
Ben Busby 708769f682 Minor styling refactor, updated app name 2020-05-04 18:00:43 -06:00
Ben Busby 0300eab6df Updated formatting and setup instructions
Switched encoding from utf-8 to unicode-escape in an effort to support multiple
languages besides English.

Updated image results page formatting to fix bad image links (added TODO
for adding full res image link for each image result).

Updated README to include libcurl and libssl install instructions for
manual setup.
2020-05-03 19:32:47 -06:00
Ben Busby 39c475af21 Using urlencode "doseq" option for url args 2020-04-29 20:31:03 -06:00
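
For context on the `doseq` option named in commit 39c475af21: when parameter values are lists (as they are in parsed form or query data), `urllib.parse.urlencode` needs `doseq=True` to emit one `key=value` pair per list item instead of encoding the list's string representation. The parameter names and values below are illustrative only.

```python
from urllib.parse import urlencode

# Illustrative parameters; values are lists, as parsed request data often is.
params = {"q": ["whoogle search"], "tbm": ["isch"]}

# Without doseq, each list is stringified and then percent-encoded.
without_doseq = urlencode(params)

# With doseq, each list item becomes its own key=value pair.
with_doseq = urlencode(params, doseq=True)
print(with_doseq)  # q=whoogle+search&tbm=isch
```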