Commit Graph

467 Commits (6abe5511f42e6f5711a1ff6d963630ac42fdd800)

Author SHA1 Message Date
Ben Busby a6a97aa9c7
Catch failure to restore adv search state
Shouldn't throw any errors if this fails to be restored from local
storage for any reason. It's purely a nice-to-have feature.
2022-08-03 17:59:08 -06:00
Ben Busby cab1105169
Add an "advanced search" toggle in result tabs
Adds a new advanced search icon alongside the result tabs for switching
to a different country from the result page.

This will obviously get populated with other methods of filtering
results, but for now it's just the country selector.
2022-08-03 17:55:26 -06:00
Ben Busby 2eee0b87d5
Include full path when determining proxy host url
Session validation includes a method for determining the proxy host url,
but previously did not include the path for the initial request. This
caused a situation where users with a new session would not be able to
complete their first search, since the session validation follow-through
url did not include the actual path for their search query.

The method now includes a flag for only extracting the root url, which
is needed for creating full urls in the content filter.

Fixes #708
2022-08-02 10:57:59 -06:00
Ben Busby aa198ed562
Include leading slash in path replacement for result config changes 2022-08-01 14:35:43 -06:00
Ben Busby 3f363b0175
Allow temp region selection from result view
This adds a new "temporary" config section of the results view, where a
user can now change the country that their results come from without
changing their default config settings.

Closes #322
2022-08-01 14:32:24 -06:00
Ben Busby 73dd5b80b5
Remove google prefs link for mismatched language queries
Queries performed in a different language than what is configured
contain a result div that prompts the user to configure their language
preferences using google's preferences page.

Since we want all language configuration to occur on Whoogle only, we
can safely remove this result div.

Fixes #444
Fixes #386
2022-08-01 13:46:06 -06:00
Ben Busby 839683b4e1
Allow result navigation w/ Tab and Shift+Tab
Closes #457
2022-08-01 13:01:12 -06:00
Ben Busby 78614877f2
Fix redirect for misspelled queries starting with `/`
Fixes #818
2022-08-01 12:12:55 -06:00
Ben Busby bf92944b95
Support quora and imdb alts through Farside
Farside can now redirect quora links to querte instances and imdb links
to libremdb instances. This updates Whoogle to perform link replacements
for both services when site alts are configured.
2022-08-01 11:49:09 -06:00
Ben Busby fde2c4db1e
Only select default country in config if none are selected 2022-08-01 11:33:38 -06:00
Ben Busby a1adf60b30
PEP-8 fix 2022-07-13 10:31:55 -06:00
Ben Busby 5db72a9552
Use scheme in alt replacement if defined
For users running local instances of service alternatives such as
invidious, the alt replacement procedure broke if the scheme of the
original service (almost always https) didn't match the scheme of their
defined local service (likely http).

This adds a small check to see if the alt has a defined scheme, and if
so, removes the original scheme for that result.

Fixes #806
2022-07-13 10:25:51 -06:00
Kian-Meng Ang 2a8519be30
Fix typos [skip ci] (#813) 2022-07-13 10:08:44 -06:00
MadcowOG 03eeb3fad1
Strip newlines when parsing tor password (#801)
When parsing control.conf or password file, a newline character could cause
Authentication Errors.
2022-07-06 09:52:02 -06:00
Ben Busby f688b88bd8
Preserve wikipedia language setting for wikiless redirects
Wikipedia -> Wikiless redirects always result in an english language
result, even if the Wikipedia result would've been in a non-english
language. This is due to Wikipedia using language specific subdomains
(i.e. de.wikipedia.org, en.wikipedia.org, etc) whereas Wikiless uses a
"lang" url param.

This has been fixed by inspecting the subdomain of the wikipedia link
and passing that value to Wikiless as the lang param if it's determined
to be a language specific value (currently just looking for a 2-char
subdomain).

See #805
2022-07-06 09:49:43 -06:00
Marcell Fülöp ee2d3726af
Use X-Forwarded-Host as url_root when present (#799)
If Whoogle is accessed on a non-standard port _and_ proxied,
this port is lost to the application and `element['src']`s are
incorrectly formed (omitting port).

HTTP x-Forwarded-Host will contain this front port number in
a typical Nginx reverse proxy configuration.
2022-07-05 10:01:47 -06:00
Ben Busby cada4efe1d
Fix missing `os` import in routes 2022-06-27 12:36:45 -06:00
Joao A. Candido Ramos 0d2d5fff5d
Fixes handling of maps (#792)
* fixes map url, e.g. when no q parameter is given

* move maps_args from results to filter where it is used
2022-06-27 12:33:08 -06:00
jan Anja 90e160094d
Add more OpenSearch definitions (for images etc.) (#786) 2022-06-27 12:30:41 -06:00
CAB233 877785c3ca
Update Simplified Chinese translation (#794) 2022-06-24 10:52:27 -06:00
Joao A. Candido Ramos d05ec08abf
Remove wildcard imports (#791) 2022-06-24 10:51:15 -06:00
Joao A. Candido Ramos ddb8931e68
Fix image links not being opened in new tab (#790)
The majority of image links and links that are not handle by whoogle are not
opening in new tabs, this allow links that are not related to the application
to open in new tabs.
2022-06-24 10:50:14 -06:00
jan Anja 194b2eae74
Fix a crash with protected Tor control port (#785) 2022-06-22 10:23:58 -06:00
Ben Busby 966644baa0
Broaden session validation exception handling
Due to how instances installed with pip seem to have issues storing
unrelated files in the same directory as sessions, exception handling
during session validation has been expanded to blindly ignore all
exceptions. This portion of the code is more for maintainers of large
public instances with a bunch of users who block cookies anyways, so
having basic app functionality break down as a result shouldn't be the
default.
2022-06-16 15:46:18 -06:00
Ben Busby ddc73a53fe
Flip country config check in template
Country config value should be checked against the valid value when
updating the home page config, not the other way around. This can lead
to a state where a user sets up an invalid country value, but can still
be matched against a correct value that is part of the invalid value
(i.e. "countryUK" is invalid, but would match against the correct value,
"UK")

Also minor refactor of where the session file size validation occurs.
2022-06-16 12:11:23 -06:00
Ben Busby cb5557cc2e
Check file sizes in session dir before validation
For pip installed instances of Whoogle, there seems to be an issue where
files other than sessions are being stored in the same directory as the
sessions. From a brief investigation, this does not seem to be caused by
Whoogle, since Flask-Session objects are the only files stored in that
directory. It could be an issue with the library that is being used for
sessions, however.

Regardless, the app shouldn't crash when trying to validate and remove
invalid sessions, so a file size limit of 4KB was imposed during
validation. Any file found in the session directory that exceeds this
size limit will be ignored.

Fixes #777
Fixes #793
2022-06-16 11:50:13 -06:00
MadcowOG c9ee9dcc8b
Tor password authentication (#746)
Added password authentication for tor control port.

For user configuration of access to tor control port. This file should be
heavily restricted in file system.

Co-authored-by: MadcowOG <madcowog@Arch-Main.localdomain>
2022-06-16 11:05:41 -06:00
Ben Busby b03fe74f10
Ensure currency link parent exists before parsing
Fixes #782
2022-06-16 10:28:06 -06:00
Ben Busby d512745767
Bump version to 0.7.4 2022-06-13 10:37:28 -06:00
Ben Busby d51be4f529
Fix missing box shadow for light theme results
Related to 65796fd1a5

Fixes an issue where box shadows were missing for light theme results.
2022-06-10 08:58:31 -06:00
Ben Busby 35ac5ac82f
Fix autocomplete behavior on result page
Similar issue to #629, but the result page uses a different script for
handling user input, so the fix was not applied appropriately.

It has been fixed for this view now.
2022-06-09 16:40:49 -06:00
Ben Busby 65796fd1a5
Counter latest result page style changes
Google updated their styling of the result page, which broke some
components of Whoogle's result page styling (namely the result div
backgrounds for dark mode).

The GClasses class has been updated to keep track of what class names
have been updated to, and roll them back to a value that works for
Whoogle. A function was added that loops through new class names and
replaces them with their older counterparts.
2022-06-09 16:35:02 -06:00
Ben Busby a9e1f0d1bc
Refactor autocomplete/suggestion behavior (front-end only)
The previous implementation of autocomplete/suggestions on the front end
resulted in a situation where input and keydown events were constantly
being added to the search input bar. This was refactored to set up the
events only once and process suggestion navigation and appending
suggestions separately with different functions.

This has been tested on both an Android simulator, as well as an Android
tablet and seems to work as expected.

Fixes #370
Fixes #629
2022-06-07 11:01:14 -06:00
Ben Busby 47df4da4b5
Bump version to 0.7.3 2022-06-03 14:33:53 -06:00
Ben Busby f22e5ac171
Catch and ignore unpickling errors in pip installs
This seems to be caused by an odd behavior related to Flask sessions and
instances of Whoogle installed via pip. I didn't investigate it too
much, since catching and ignoring the result doesn't impact Whoogle
functionality at all (configuration and session values persist as
normal). Since this doesn't affect non-pip instances, I don't believe it
to be a fault within Whoogle itself.

Fixes #765
2022-06-03 14:29:57 -06:00
Ben Busby ef98d85dc5
Ensure searches with a leading slash are treated as queries
A user reported a bug where searches with a leading slash (in this case:
"/e/OS apps" were interpreted as a Google specific link when clicking
the next page of results.

This was due to the behavior that Google's search results exhibit, where
internal links for pages like support.google.com are delivered with
params like "?q=/support" rather than a direct link. This fixes that
scenario by checking the "q" param value against the user's original
query to ensure they don't match before assuming that the result is
intended as a redirect.

Fixes #776
2022-06-03 14:03:57 -06:00
Joao A. Candido Ramos fb6627a9cc
Remove duplicated handling of /url result links (#769)
It appears that result links beginning with '/url' were mistakenly
commited with an inefficient filtering process in its place. With the
way the code is structured, this less effective '/url' link filter took
precedence over the previous link filter, and also caused users with the
"open link in new tab" config enabled to no longer have access to that
feature.

Fixes #769
2022-05-25 11:37:34 -06:00
invis-z 9bcd9931f7
Replace leading slash for image links (#762)
The leading slash was previously removed without noticing it was part of a
string replacement in #734. This caused the href of "View Image" contain a
leading "/" which is wrong.
2022-05-25 11:18:17 -06:00
Ben Busby fb600d6fc8
Improve G page distinction between footer and results
Pages in the Whoogle footer that by default route to Google pages were
previously being removed, but caused results that also routed to similar
pages to no longer be accessible. This was due to the removal of the
'/url' endpoint that Google uses for each result.

To fix this, the result link is now parsed so that the domain of the
result can be checked against the disallowed G page list. Since results
are delivered in a "/url?q=<domain>" format -- even for pages to
Google's own products -- and the footer links are formatted as
"<product>.google.com", footer links are removed and result links are
parsed correctly.

Fixes #747
2022-05-16 09:53:48 -06:00
Ben Busby f5d599e7d2
Use `lax` for session `SameSite` value (not `strict`)
SESSION_COOKIE_SAMESITE must be set to 'lax' to allow the user's
previous session to persist when accessing the instance from an external
link. Setting this value to 'strict' causes Whoogle to revalidate a new
session, and fail, resulting in cookies being disabled.

This could be re-evaluated if Whoogle ever switches to client side
configuration instead.

Fixes #749
2022-05-10 17:40:58 -06:00
invis-z 0f6226ce51
Use `window` from Endpoint enum for anon view (#748)
Removes previously hardcoded "/window" from anon view links
2022-05-10 16:06:57 -06:00
hoschi1337 b809c88fa5
Fix german translation error (#742)
"Nachrichten" is the correct translation of "News"
2022-05-02 11:56:21 -06:00
xatier 7486697d41
Update zh-tw translation (#736) 2022-05-02 11:53:33 -06:00
invis-z b4d9f1f5e5
Remove "/" before endpoints & tags (#734)
Removes the leading slash before imgres and other endpoints

Fix #733
2022-04-27 14:25:14 -06:00
Ben Busby 8a0b872337
Bump version to 0.7.2 2022-04-26 16:49:30 -06:00
Ben Busby 2490089645
Remove unused `/url` endpoint
The `/url` endpoint was previously used as a way of mirroring the
`/url?q=<result domain>` formatting of locations in search results from
Google. Rather than have this unnecessary intermediary step, the result
path was extracted and used as the immediate path for each result item
instead.

This endpoint hasn't been in use for many versions and has been in need
of removal for quite some time.
2022-04-26 16:28:04 -06:00
Ben Busby 62d7491936
Only create ip card if main result div is found
The ip address card that is created for searches like "my ip" only needs
to be created/inserted if a main result div id is found.

Fixes #735
2022-04-26 15:18:29 -06:00
Ben Busby abc30d7da3
Render error message w/o `safe` filter
The error message shown in the error template does not need to be
rendered using the safe filter, and furthermore opens up an XSS
vulnerability.
2022-04-26 09:28:05 -06:00
Warren Spits d62ceb8423
Add proxyfix to honor `X-Forwarded-Proto` header (#731)
Fixes #730
2022-04-22 11:07:36 -06:00
Ben Busby a9b675cd24
Strip trailing slash on root url in filter
If a trailing slash is defined here, it causes the Whoogle instance to
redirect these element requests back to the home page, causing unwanted
behavior.
2022-04-20 14:55:19 -06:00
Ben Busby 5c8be4428b
Fall back to netloc for bang search if query is empty
Previously, empty bang searches would redirect to the Whoogle instance
home page. This now redirects to the specific site for the bang search
instead (i.e. "!yt" without a query redirects to "youtube.com", "!gh" to
"github.com", etc)

Fixes #719
2022-04-20 14:50:32 -06:00
Ben Busby 7688c1a233
Revert anon-view key change from #724
The "anon-view" translation key is the correct one to use for accessing
anonymous view within the search results. "config-anon-view" is only for
the configuration menu on the home page.
2022-04-20 14:11:29 -06:00
gdm85 6d362ca5c7
Add support for relative search results (#715)
* Relativization of search results

* Fix JavaScript error when opening images

* Replace single-letter logo and remove sign-in link

* Add `WHOOGLE_URL_PREFIX` env var to support relative path redirection

The `WHOOGLE_URL_PREFIX` var can now be set to fix internal app
redirects, such as the `/session` redirect performed on the first visit
to the Whoogle home page.

Co-authored-by: Ben Busby <contact@benbusby.com>
2022-04-18 15:27:45 -06:00
gdm85 94b4eb08a2
Return 401 when token is invalid (#714)
In some rare instances (a race condition perhaps?) a
`cryptography.fernet.InvalidToken` exception is thrown resulting in
a broken connection.

This change gracefully returns a 401 error instead.
2022-04-18 13:06:44 -06:00
Ilya Prokopenko cded1e0272
Fix Russian translation (#726) 2022-04-18 12:46:02 -06:00
glitsj16 ca80bb0caa
Fix 'anon-view' KeyError (#724) 2022-04-18 12:45:20 -06:00
Ben Busby 9317d9217f
Support proxying results through Whoogle (aka "anonymous view") (#682)
* Expand `/window` endpoint to behave like a proxy

The `/window` endpoint was previously used as a type of proxy, but only
for removing Javascript from the result page. This expands the existing
functionality to allow users to proxy search result pages (with or without
Javascript) through their Whoogle instance.

* Implement filtering of remote content from css

* Condense NoJS feature into Anonymous View

Enabling NoJS now removes Javascript from the Anonymous View, rather
than creating a separate option.

* Exclude 'data:' urls from filter, add translations

The 'data:' url must be allowed in results to view certain elements on
the page, such as stars for review based results.

Add translations for the remaining languages.

* Add cssutils to requirements
2022-04-13 11:29:07 -06:00
gdm85 739a5092cc
Do not offer opensearch.xml as attachment (#713)
Sending opensearch.xml as an attachment is unnecessary. 

This will also allow inspecting the XML file via browser without downloading
it.
2022-04-07 13:52:17 -06:00
Ben Busby 2fcfeacd44
Reduce search bar font size on mobile
24px->20px

Fixes #477
2022-04-06 14:44:17 -06:00
Ben Busby 0e5630f33a
Add ability to listen on unix sockets
Introduces a way to tell the app to listen on unix socket instead of
host:port.

Fixes #436
2022-04-06 14:11:52 -06:00
Ben Busby 797372ecaa
Ignore blank alts if site alt config is enabled
If the alt for a particular service is blank, the original source is
used instead.

Example:
1. Site alts enabled in config
2. User wants wikipedia links, not wikiless
3. WHOOGLE_ALT_WIKI set to ""
4. All available alt links redirected to farside, except wikipedia

Fixes #704
2022-03-30 14:46:33 -06:00
green1052 0d6901aaa2
Add korean translation (#700) 2022-03-28 10:11:57 -06:00
138138138 5ecd4fe931
Add "nofollow noopener noreferrer" to all links (#698)
Old iOS 12 devices will pass the Referer HTTP header to the site user clicks.
Websites will know those traffic come from Whoogle search.
Adding "nofollow noopener noreferrer" solves the issue.
2022-03-28 10:11:09 -06:00
xatier e575fad324
Fix incorrect translation (zh-TW & zh-CN) (#697)
Translation for `maps` and `videos` were swapped in this commit.

11099f7b1d (diff-fcd1e088df6519cbd45d012f89a0d2722b7414c94189ee41595a3a101b4c11ad)
2022-03-28 10:10:18 -06:00
Ben Busby f5c47234de
Fix time filter background color
The time filter (past day/hour/month/etc) was using the result element
background color instead of the page background color, which wasn't
providing enough contrast with the default text color.
2022-03-25 12:14:57 -06:00
Ben Busby 0048c2f9aa
Update remaining alternative frontends to use Farside
Wikipedia, imgur, and translate alternatives were all still using
hardcoded URLs when replaced with their respective alternative frontend.
This updates them to use farside instead.
2022-03-21 10:08:52 -06:00
Ben Busby a58f70ca7e
Fix wikipedia->wikiless domain replacement
Was previously using wikipedia.com not wikipedia.org, causing wikiless
replacements to not occur.

Fixes #686
2022-03-21 10:01:21 -06:00
Ben Busby 2a0ad8796c
Switch to defusedxml for xml parsing
xml.etree.ElementTree.fromstring is considered insecure, see:
https://docs.python.org/3/library/xml.etree.elementtree.html

The defusedxml package contains several Python-only workarounds and
fixes for denial of service and other vulnerabilities in Python's XML
libraries: https://github.com/tiran/defusedxml

Fixes #670
2022-03-01 12:54:32 -07:00
Ben Busby f7e3650728
Only remove G links in footer
Links that were directed at G domains were previously removed
universally, when really they only needed to be removed from the footer
to reduce possible confusion caused by mixed Whoogle and G links.

Fixes #656
2022-03-01 12:48:33 -07:00
Ben Busby 69f845a047
Add test for empty bang behavior
Also fix pep8 issue
2022-03-01 12:13:40 -07:00
Ben Busby 809520ec70
Fallback to home page for empty bang searches
Bang searches without an actual query (i.e. just searching "!gh") will
now redirect to the home page. I guess people do this for some reason
and don't like that it redirects to the correct bang result URL, but
without an actual search term.

Fixes #595
2022-03-01 12:06:59 -07:00
Ben Busby b28fa86e33
Update ad filter
Recent changes to ads in search results caused Whoogle to display ads
for certain searches. In particular, ads recently started appearing
grouped into one div, as opposed to a singular ad per div. This was
accompanied by the div label "ads" (instead of just "ad"), which threw
off the existing ad filter. The ad keyword blacklist has been updated
accordingly, and has been enhanced to only check against alpha chars for
each label.

This only seems to have affected English language searches, and only for
very specific searches.
2022-02-25 23:02:58 -07:00
Ben Busby 9984158ec1
Ensure valid str->float conv in currency calc
Currency amounts returned by google seem to randomly include unicode
chars ('\xa0' noted in #642) which broke the currency calculator
included in the project. This ensures that only strings that can be
converted to float are ever used in the conversion.

Fixes #642
2022-02-17 16:33:44 -07:00
Nitish Yadav 0e711beca7
Give `Accept-Language` div its own class (#659)
Fixes accidental assignment of "get-only" class to the
"Accept-Language" config option
2022-02-16 09:23:38 -07:00
Ben Busby 23402e27e1
Check for updates using 24 hour time delta
Rather than only checking for an available update on app init, the check
for updates now performs the check once every 24 hours on the first
request sent after that period.

This also now catches the requests.exceptions.ConnectionError that is
thrown if the app is initialized without an active internet connection.

Fixes #649
2022-02-14 12:19:02 -07:00
Ben Busby d33e8241dc
Fix "my ip" search regression
Removes dependency on class names for creating the "my ip" info card in
the results list for searches pertaining to the user's public IP.

Adds test to prevent this from happening again.

Note to anyone reading this and looking to contribute: please avoid
using hardcoded class names at all costs. This approach of
creating/removing content just results in issues if/when Google decides
to introduce/remove class names from the result page.

Fixes #657
2022-02-14 11:40:11 -07:00
DUO Labs b2c048af92
Fix `collapse_sections` for `MINIMAL_MODE` (#654) 2022-02-11 14:44:08 -07:00
DUO Labs 7c5094d37b
Check for soup body in `remove_site_blocks` (#651)
Fixes error with `remove_site_blocks` in the Images tab
2022-02-11 14:42:11 -07:00
DUO Labs 502067addc
Clean "Show more results" of all site blocks (#646) 2022-02-08 10:57:00 -07:00
Joao A. Candido Ramos 11099f7b1d
Use consistent header for all result types (#535)
Introduces a header for switching between result types (i.e. "All", "News",
etc) that is consistent between the different result types. Previously, image
results had a tab header that was formatted in a drastically different manner,
which was jarring when switching from a different result page to the Images
page.

Created a G class enum to reference class names returned in search
results. As noted in the class doc, this should only be used/updated as
a last resort, as class names change frequently. For some instances,
such as replacing the tbm tab, it's a lot easier to just replace by
header name than attempting to replace it based on how the element is
structured.

Also updated a few styles to revert the latest styling changes being
applied by Google.

Co-authored-by: jacr13 <ramos.joao@protonmail.com>
Co-authored-by: Ben Busby <contact@benbusby.com>
2022-02-07 10:47:25 -07:00
සයුරි | Sayuri 4aa94a5d75
Fix Sinhala translation for farside search (#594) 2022-02-04 16:16:56 -07:00
DUO Labs 500942cb99
Update minimal mode for new Google formatting (#637)
Google's latest formatting changes broke the modifications made when enabling
`WHOOGLE_MINIMAL`. This updates the result filtering to work with the new
changes.

Fixes #634
2022-02-02 12:57:05 -07:00
Ben Busby b393e68d1d
Fix incorrect min-width for mobile screen sizes
min-width was previously set to 736px for all screen sizes, which forced
content off screen for smaller devices such as mobile phones. This
modifies the search stylesheet to only apply a min-width style to
devices > 800px wide.
2022-02-01 20:36:53 -07:00
Ben Busby e3394e29dd
Amend body width formatting in search css
`min-width` is a better field to override than `max-width`, since some
users prefer full width results.
2022-02-01 17:24:12 -07:00
Ben Busby 9ba73331aa
Override new Google search result formatting
There have been some recent formatting changes made by Google for search
results that do not look good (especially for dark themes). This
mostly overrides those styles to resemble the original Whoogle
result formatting.
2022-02-01 17:15:48 -07:00
Ben Busby 33f56bb0cb
Read `WHOOGLE_CONFIG_DISABLE` var as bool in app init
Fixes #636, which pointed out that the var was being interpreted as
"active" (config hidden) regardless of the value that was set.
2022-02-01 15:29:22 -07:00
Ben Busby 1af4566991
Bump version to 0.7.1 2022-01-26 10:41:41 -07:00
Ben Busby 863cbb2b8d
Remove trailing whitespace 2022-01-25 12:31:19 -07:00
Ben Busby 72e5a227c8
Move bangs init to bg thread
Initializing the DDG bangs when running whoogle for the first time
creates an indeterminate amount of delay before the app becomes usable,
which makes usability tests (particularly w/ Docker) unreliable. This
moves the bang json init to a background thread and writes a temporary
empty dict to the bangs json file until the full bangs json can be used.
2022-01-25 12:28:06 -07:00
Nitish Yadav fc50359752
Improve formatting of collapsible infobox (#612) 2022-01-18 13:47:35 -07:00
DUO Labs 257e3f33ef
Skip loading autocomplete.js if `WHOOGLE_AUTOCOMPLETE=0` (#611)
Bypasses autocomplete.js if `WHOOGLE_AUTOCOMPLETE` is set to 0
2022-01-18 13:39:56 -07:00
DUO Labs 74cb48086c
Introduce site alts for imgur and wikipedia (#609)
* Add `WHOOGLE_ALT_IMG` for a replacement for imgur.

* Add `WHOOGLE_ALT_WIKI` for Wikipedia
2022-01-14 09:59:03 -07:00
Ben Busby ded787547a
Exclude opensearch route from session validation
Fixes #588
2022-01-11 10:50:35 -07:00
Ben Busby f4b65be876
Catch invalid XML in suggestion response
As reported in #593, the XML response body returned for search
suggestions can apparently contain invalid XML elements. This catches
the error and returns an empty suggestion list instead of erroring.

Fixes #593
2021-12-28 11:38:18 -07:00
Ben Busby 8c92b381a2
Remove default country param
The country URL param ('gl') is no longer set to 'US' by default, and is
omitted from the search entirely unless explicitly set by the user. This
change was made in an attempt to cut back on the number of captchas
experienced by certain users self-hosting who experienced a decreased
amount of captchas when this configuration setting was removed.

Fixes #558
2021-12-23 17:01:49 -07:00
Ben Busby d02a7d90b9
Use UTF-8 encoding when loading json files
Fixes #581
2021-12-21 14:11:55 -07:00
Ben Busby 6d9df65d02
Catch `FileNotFound` when clearing invalid sessions
The server now consumes the FNF error if an invalid session is found but
is deleted in an earlier thread.

Fixes #577
2021-12-21 14:03:24 -07:00
Ben Busby 3d8da1db58
Bump version to 0.7.0 2021-12-08 17:57:22 -07:00
Ben Busby 634d179568
Use farside.link for frontend alternatives in results (#560)
* Integrate Farside into Whoogle

When instances are ratelimited (when a captcha is returned instead of
the user's search results) the user can now hop to a new instance via
Farside, a new backend service that redirects users to working instances
of a particular frontend. In this case, it presents a user with a
Farside link to a new Whoogle (or Searx) instance instead, so that the
user can resume their search.

For the generated Farside->Whoogle link, the generated link includes the
user's current Whoogle configuration settings as URL params, to ensure a
more seamless transition between instances. This doesn't translate to
the Farside->Searx link, but potentially could with some changes.

* Expand conversion of config<->url params

Config settings can now be translated to and from URL params using a
predetermined set of "safe" keys (i.e. config settings that easily
translate to URL params).

* Allow jumping instances via Farside when ratelimited

When instances are ratelimited (when a captcha is returned instead of
the user's search results) the user can now hop to a new instance via
Farside, a new backend service that redirects users to working instances
of a particular frontend. In this case, it presents a user with a
Farside link to a new Whoogle (or Searx) instance instead, so that the
user can resume their search.

For the generated Farside->Whoogle link, the generated link includes the
user's current Whoogle configuration settings as URL params, to ensure a
more seamless transition between instances. This doesn't translate to
the Farside->Searx link, but potentially could with some changes.

Closes #554

Closes #559
2021-12-08 17:27:33 -07:00
Vansh Comar 7bea6349a0
Add tools for currency conversion in search results (#536)
This implements a method for converting between various currencies. When a user
searches "<currency A> to <currency B>" (including when prefixed by a specific
amount), they are now presented with a table for quickly converting between the
two. This makes use of the currency ratio returned as the first "card" in
currency related searches, and the table is inserted into this same card.
2021-12-06 22:56:13 -07:00
Ben Busby 10a15e06e1
Fix incorrect request type for image searches
Previously had hardcoded POST requests for all requests that didn't use
the header template (which currently is only the image tab).

Also refactored how the Filter class works. It now requires a valid
Config model to be provided, which is then set up as a class var that
the filtering functions can use as needed, rather than setting specific
values from the config as individual values (which was confusing and
sloppy).

Fixes #561
2021-12-06 21:39:50 -07:00
Ben Busby b75ff0782d
pep8: fix CSP header line length 2021-11-29 15:58:19 -07:00
Ben Busby 3e20788857
Disable in-app CSP unless enabled via WHOOGLE_CSP
The default CSP is only helpful for some, and can break instances for
others. Since these aren't always necessary and are occasionally set by
the user's preferred reverse proxy, it is being disabled unless
explicitly enabled by setting `WHOOGLE_CSP`.

Fixes #493
2021-11-29 15:52:28 -07:00
Ben Busby f73e4b9239
Fix height for homepage logo 2021-11-29 15:34:13 -07:00
Ben Busby 27051363ff
Adjust logo css for mobile devices
Fixes #557
2021-11-27 20:03:06 -07:00
Ben Busby 9c96f0fd57
Improve default response headers
Reponse headers now include the following:
- X-Content-Type-Options: nosniff
- X-Frame-Options: DENY
- Strict-Transport-Security: max-age=63072000
  - Only when HTTPS_ONLY is set

https://infosec.mozilla.org/guidelines/web_security#http-strict-transport-security
https://infosec.mozilla.org/guidelines/web_security#x-content-type-options
https://infosec.mozilla.org/guidelines/web_security#x-frame-options
2021-11-26 08:38:26 -07:00
Ben Busby 73f631b1f9
Import logo stylesheet before applying custom css
This fixes #551, and allows custom css to be applied to the Whoogle
logo.
2021-11-24 12:38:56 -07:00
Ben Busby 3c06519130
Use 'gl' search param to set country
This switches the param used for the "country" config setting from "cr"
(which only filters results by the country the result is hosted in) to
"gl" (which overrides server/hosting location and produces results that
are more accurate for the user's current country).

Before this change, the country config setting was (imo) pretty useless.
Allowing a user to override an instance's hosting location with their
preferred country though is way more useful, especially for public
instances that are hosted in a different country than the user.

Closes #544
2021-11-23 13:48:54 -07:00
Ben Busby 1d3e7c0255
Pin config buttons to bottom of config menu
Previously the load/save/apply buttons in the config menu were hidden
below all available config options and required the user to scroll to
the bottom to save changes. This made for bad ux, since for new users,
it isn't immediately apparent that selecting a new dropdown value, for
instance, doesn't instantly save the new setting. The new layout should
make it more clear that hitting "Apply" is required to save config
changes.
2021-11-23 12:27:59 -07:00
Ilya Prokopenko 79a4a17311
Add Russian translation (#552) 2021-11-23 10:36:52 -07:00
Ben Busby 5a27d748d1
Create separate test workflow for docker
This expands on the current testing suite a bit by introducing a new
workflow for testing functionality within the docker container. It runs
the same test suite as the regular "test" workflow, but also performs a
health check after running the app for 10 seconds to ensure
functionality.

The buildx workflow now waits for the docker test script to finish
successfully, rather than the regular test workflow. This will hopefully
avoid situations where new images are pushed with issues that aren't
detected in regular testing of the app.
2021-11-22 00:26:25 -07:00
Ben Busby 6f5f3d8ca7
Fix incorrect redirect protocol used by Flask
Flask's `request.url` uses `http` as the protocol, which breaks
instances that enforce `https`, since the session redirect relies on
`request.url` for the follow-through URL.

This introduces a new method for determining the correct URL to use for
these redirects by automatically replacing the protocol with `https` if
the `HTTPS_ONLY` env var is set for that instance.

Fixes #538

Fixes #545
2021-11-21 23:21:04 -07:00
Ben Busby 0c5578937e
Remove 308 redirect for http->https
HTTPS upgrades should be handled outside of Whoogle, since Flask often
doesn't detect the right protocol when being used behind a reverse proxy
such as Nginx.
2021-11-20 16:43:57 -07:00
Ben Busby de28e06d8f
Improve cookie security when `HTTPS_ONLY` is set
Adds the "Secure" flag and "__Secure-" prefix if the `HTTPS_ONLY`
environment variable is enabled.

Fixes #539
2021-11-20 16:34:37 -07:00
Ben Busby e06ff85579
Improve public instance session management (#480)
This introduces a new approach to handling user sessions, which should
allow for users to set more reliable config settings on public instances.

Previously, when a user with cookies disabled would update their config,
this would modify the app's default config file, which would in turn
cause new users to inherit these settings when visiting the app for the
first time and cause users to inherit these settings when their current
session cookie expired (which was after 30 days by default I believe).
There was also some half-baked logic for determining on the backend
whether or not a user had cookies disabled, which lead to some issues
with out of control session file creation by Flask.

Now, when a user visits the site, their initial request is forwarded to
a session/<session id> endpoint, and during that subsequent request
their current session id is matched against the one found in the url. If
the ids match, the user has cookies enabled. If not, their original
request is modified with a 'cookies_disabled' query param that tells
Flask not to bother trying to set up a new session for that user, and
instead just use the app's fallback Fernet key for encryption and the
default config.

Since attempting to create a session for a user with cookies disabled
creates a new session file, there is now also a clean-up routine included
in the new session decorator, which will remove all sessions that don't
include a valid key in the dict. NOTE!!! This means that current user
sessions on public instances will be cleared once this update is merged
in. In the long run that's a good thing though, since this will allow session
mgmt to be a lot more reliable overall for users regardless of their cookie
preference.

Individual user sessions still use a unique Fernet key for encrypting queries,
but users with cookies disabled will use the default app key for encryption
and decryption.

Sessions are also now (semi)permanent and have a lifetime of 1 year.
2021-11-17 19:35:30 -07:00
Joao A. Candido Ramos 1f18e505ab
Include "chips" param in image search (#534)
"chips" is used in image tabs to pass the optional "filter" to add to the
given search term

Fixes #299
2021-11-17 16:17:27 -07:00
Ben Busby e93507f148
Catch connection error during Tor validation step
Validation of the Tor connection occasionally fails with a
ConnectionError from requests, which was previously uncaught. This is
now handled appropriately (error message shown and connection dropped).

Fixes #532
2021-11-12 17:19:45 -07:00
gnuhead-chieb 3f40a6c485
Add Japanese translation (#528) 2021-11-09 08:37:49 -07:00
Fabian Schilling 9ad1d60a47
Improve URL parsing for full size images (#521)
Skip URLs that are not two-element lists

Fixes #520
2021-11-02 16:22:24 -06:00
Vansh Comar 3784d897d9
Add "update available" indicator to footer (#517)
This checks the latest released version of Whoogle against
the current app version, and shows an "update available"
message if the current version num < latest release num.

Closes #305
2021-11-02 10:35:40 -06:00
Ben Busby b73c14c7cc
Set max height for config menu
The config menu has gotten out of control recently, but rather than
reducing functionality, I'm just going to set a max height for the div
and allow scrolling within the menu.

Ultimately though this indicates that the app is getting a bit too
complicated (imo). Striking a balance between customization and
minimalism is less of a priority for me nowadays though, hence why I'm
willing to let it slide for now. At some point, maybe when there are
more contributors, it could be nice to refactor this in some way so that
it isn't overwhelming to new users who are looking to customize their
instance (that's just me speculating btw, I haven't actually heard from
anyone who thinks there are too many options in that menu).
2021-11-01 16:55:33 -06:00
Ben Busby c766554eea
Bang refactor PEP-8 fix
Addresses PEP-8 formatting issue in previous commit
2021-11-01 16:53:19 -06:00
Ben Busby ddf951de35
Use `replace` in bang query formatting
Using `format` for formatting bang queries caused a KeyError for some
searches, such as !hd (HUDOC). In that example, the URL returned in the
bangs json was `http://...#{%22fulltext%22:[%22{}%22]...`, where
standard formatting would not work due to the misidentification of
"fulltext" as a formatting key.

The logic has been updated to just replace the first occurence of "{}"
in the URL returned by the bangs dict.

Fixes #513
2021-11-01 16:47:48 -06:00
gripped d1c9b7f803
Remove styling from NoJS liks (#511)
Fixes #510
2021-11-01 16:03:47 -06:00
Ben Busby 7fe066b4ea
Escape result html after bolding search terms
Fixes #518
2021-11-01 15:35:57 -06:00
gripped c2ced23073
Improve formatting with NoJS enabled (#509)
Removes line breaks, divider, and link location from all NoJS
links in results when NoJS mode is enabled
2021-10-29 09:28:05 -06:00
Ben Busby 0a78c524fa
Expand 'my ip' to work for proxied requests
Adds a check for the HTTP_X_FORWARDED_FOR header, and uses the value
from the request if found.
2021-10-28 21:31:24 -06:00
Ben Busby 26b560da1d
Pass response as str to bsoup for "my ip" card
Due to how the response is now reformed into a new bsoup object when
bolding search query terms, creating an ip card for "my ip" searches
threw an error due to how the new bsoup object was initialized for the
"my ip" card. This passes the response in as a string instead.

Fixes #504
2021-10-28 21:22:51 -06:00
Ben Busby cad1e2ab4d
Include translation mapping in nojs windows
The translation map was missing for links opened via the nojs feature,
causing a server error.

Fixes #507
2021-10-28 21:06:52 -06:00
DUO Labs 5189cdb072
Update "skip bolding" regex to fix some edge cases (#500)
Should address errors caused by the "bold query" feature replacing
tags and style elements, resulting in unformatted response pages.
2021-10-28 12:54:27 -06:00
Vansh Comar f04c7c5557
Support DDG style bangs with bang at the end (#503)
DDG style bang searches can now have the bang (!) at the end of
the search (i.e. "bologna w!" will now redirect to wikipedia just like
"bologna !w" would)
2021-10-28 12:39:33 -06:00
Ben Busby 190b684469
Reformat view templates 2021-10-27 12:30:55 -06:00
Ben Busby b96e3a0acb
Make base search url a member of the request class
Since the request class is loaded prior to values being read from the
user's dotenv, the WHOOGLE_RESULT_PER_PAGE var wasn't being used for
searches.

This moves the definition of the base search url to be intialized in the
request class to address this issue.

Fixes #497
2021-10-27 11:02:14 -06:00
DUO Labs d8dcdc7455
Skip bolding search terms that are not alphanumeric (#496)
Fixes #494
2021-10-27 10:50:21 -06:00
Ben Busby 1abd040428
Remove redundant loading of variables.css
variables.css doesn't need to be loaded by any template, since
WHOOGLE_CONFIG_STYLE loads those values by default when not set
explicitly. Loading the stylesheet caused the logo colors to be
persistent unless set individually.

Sorry @gripped for sneaking all of this unnecessary color in...

Fixes #492
2021-10-26 21:11:46 -06:00
Ben Busby 591ed4a6d6
Use f-string in bold query regex
by @DUOLabs333
2021-10-26 16:21:30 -06:00
Ben Busby f154b5f2e2
PEP-8 formatting fix 2021-10-26 16:17:38 -06:00
Ben Busby 6decab5a51
Improve regex for bolding search terms
Co-authored by @DUOLabs333
2021-10-26 16:15:24 -06:00
Ben Busby d16ef6d011
Unescape search response before rendering template
Fixes a small issue with the previous commit where bolded search terms
had the <b> tags escaped, rather than being applied as actual html.
2021-10-26 15:00:39 -06:00
DUO Labs 2c9cf3ecc6
Bold search query in results (#487)
This modifies the search result page by bold-ing all appearances
of any word in the original query. If portions of the query are in
quotes (i.e. "ice cream"), only exact matches of the sequence of
words will be made bold.

Co-authored-by: Ben Busby <noreply+git@benbusby.com>
2021-10-26 14:59:23 -06:00
Ben Busby 90441b2668
Add WHOOGLE_MINIMAL to docs, tweak min mode logic
Activating minimal mode should also remove all collapsed sections, if
any are found.

WHOOGLE_MINIMAL now documented in readme and app.json (for heroku).
2021-10-26 10:38:20 -06:00
DUO Labs 543f2b2a01
Add a "minimal mode" for condensing results (#485)
If WHOOGLE_MINIMAL is set, all non-link results are
removed from the view.
2021-10-26 10:35:12 -06:00
DUO Labs 5a05bfb6de
Allow setting number of results per page (#486)
Add `WHOOGLE_RESULTS_PER_PAGE` var, allowing users to 
specify the number of results per page. The default is 10.
2021-10-26 10:28:38 -06:00
Vansh Comar 5118ddb8b8
Allow setting "Accept-Language" header (#483)
Closes #445
2021-10-25 15:49:09 -06:00
Ben Busby 91002ec6be
Update default theme css
I've gotten a bit bored of the current light/dark themes, so I'm
switching the default theme over to the Doppelganger theme, which is a
better template/jumping off point for users to use when creating custom
themes since it also provides examples for coloring each of the Whoogle
logo letters.
2021-10-23 23:56:38 -06:00
Ben Busby 8f70236403
Update domains used for scribe.rip replacements
The levelup.gitconnected.com site is a Medium site that can also be
replaced with scribe.rip whenever privacy respecting site alternatives
are enabled in the config.

Also modified how link descriptions are updated when that config is
enabled (before it was missing replacements on quite a few
descriptions).
2021-10-23 23:23:37 -06:00
Vansh Comar 771bf34ce9
Show client IP for "my ip" searches (#469)
This introduces a new UI element for displaying the client IP
address when a search for "my ip" is used.

Note that this does not show the IP address seen by Google
if Whoogle is deployed remotely. It uses `request.remote_addr`
to display the client IP address in the UI, not the actual address
of the server (which is what Google sees in requests sent from
remote Whoogle instances).
2021-10-21 10:42:31 -06:00
Yadomin 284a8102c8
Block by result title or url using regex (#473)
Allows blocking search results using a regex filter for either
result title or result url
2021-10-20 20:01:04 -06:00
Vansh Comar 79fb7531be
Implement scribe.rip replacement for medium.com results (#463)
scribe.rip is a privacy respecting front end for medium.com. This
feature allows medium.com results to be replaced with scribe.rip links,
and works for both regular medium.com domains as well as user specific
subdomains (i.e. user.medium.com).

[scribe.rip website](https://scribe.rip)
[scribe.rip source code](https://git.sr.ht/~edwardloveall/scribe)

Co-authored-by: Ben Busby <noreply+git@benbusby.com>
2021-10-16 12:22:00 -06:00
Ben Busby ee6a27e541
Add link to user css themes in config menu 2021-10-14 20:20:12 -06:00