The wget method seemed to have a possible issue with creating endless
index.html copies (despite being specified to output to console only),
so this has been updated to use curl instead.
Also uses new non-authenticated "healthz" route to perform the
healthcheck.
Fix#316Fix#313
The previous method of removing all site filters from the search query
removed the last letter of the search. This only applies the substring
filter if any site filters are present in the query.
Fixes#306
* Block websites in search results via user config
Adds a new config field "Block" to specify a comma separated list of
websites to block in search results. This is applied for all searches.
* Add test for blocking sites from search results
* Document WHOOGLE_CONFIG_BLOCK usage
* Strip '-site:' filters from query in header template
The 'behind the scenes' site filter applied for blocked sites was
appearing in the query field when navigating between search categories
(all -> images -> news, etc). This prevents the filter from appearing in
all except "images", since the image category uses a separate header.
This should eventually be addressed when the image page can begin using
the standard whoogle header, but until then, the filter will still
appear for image searches.
* Add option to disable changing of configuration
Introduces a test to ensure the correct response code is found when
attempting to update the config when disabled, and ensure default config
is unchanged when posting a new config dict.
Attempting to update the config using the API when disabled now returns
a 403 code + redirect.
Co-authored-by: Ben Busby <benbusby@protonmail.com>
The search language is now set using the WHOOGLE_CONFIG_SEARCH_LANGUAGE
environment variable. Interface language is still set using
WHOOGLE_CONFIG_LANGUAGE.
Fixes#260
Enforces 0 margin for the search input form on the result page, which
removes the weird gap that is seen by default.
Also made minor changes to the border styling. Desktop searches now have
a single bottom border in dark mode rather than an all around border,
and the border around the mobile search result input was removed
entirely.
This was unfortunately a bit more complex than just adding an HTML reset
button, since reset buttons only "reset" input content to its original
value rather than clearing it. This doesn't work for Whoogle's needs,
since inputs on search result pages are auto populated with the search
content as their default value.
A reset button was introduced anyways, but is controlled by a few lines
of javascript to allow completely clearing the search input. The button
will only appear on mobile searches.
At the moment, it isn't particularly pretty, but is functional. It uses
just a plain "x" character and is always visible on mobile search result
pages. This leaves plenty of room for improvement moving forward.
Fixes#291
The recent change to cast bool config vars as ints to handle a '0' or
'1' value was shortsighted, since it doesn't allow for instances where
the variable is set to an empty value (or '' or any invalid/non-int
value).
This introduces a read_config_bool method for reading values that should
be a '0' or '1', but will default to False if not a digit (otherwise the
value will be cast as bool(int(value)) if "value" is a digit str).
Fixes#288
Config boolean environment variables need to be cast to ints, since
they are set or unset using 0 and 1. Previously they were interpreted as
(pseudocode) read_var(name, default=False), which meant that setting
CONFIG_VAR=0 would enable that variable since Python reads environment
variables as strings, and '0' is truthy. This updates the previous logic
to (still pseudocode) int(read_var(name, default='0')).
Fixes#279
Both light and dark themes have been updated to remove the leftover
hardcoded values (mostly related to the search suggestion styling).
See discussion in #247.
The logging from imported modules (stem, in particular) has caused quite
a few users to assume there are errors where there aren't any. The logs
from stem also aren't helpful, as everything in the library works as
expected despite the implication from the logs that it is not working.
Randomizing the "Mozilla" portion of the user agent changed the
character encoding to GB2312. Setting it to plain "Mozilla" enforces
UTF-8 encoding.
Bump to version 0.4.1 for release of bug fix
Fixes#267
This moves away from the previous (messy) approach of using two separate
keys for decrypting text and element URLs separately and regenerating
them for new searches. The current implementation of sessions is not very
reliable, which lead to keys being regenerated too soon, which would
break page navigation. Until that can be addressed, the single
key per session approach should work a lot better.
Fixes#250Fixes#90
The previous implementation of the is_heroku check in
search.needs_https() was implemented to only match URLs ending in
'.herokuapp.com', and skipped upgrading to HTTPS for other endpoints.
This introduces a set of environment variables that can be used for
defining initial config state, to expedite the process of
destroying/relaunching instances quickly with the same settings every
time.
Closes#228Closes#195
This allows the user to enable their preferred settings in a variety of
ways, depending on their deployment preference. Values added to
whoogle.env can be enabled using WHOOGLE_DOTENV=1, in which case all
values in the env var file will overwrite defaults or user provided
settings.
Co-authored-by: Ben Busby <benbusby@protonmail.com>
Eventually this should be part of a separate mypy ci build, but right
now it's just a general guideline. Future commits and PRs should be
validated for static typing wherever possible.
For reference, the testing commands used for this commit were:
mypy --ignore-missing-imports --pretty --disallow-untyped-calls app/
mypy --ignore-missing-imports --pretty --disallow-untyped-calls test/
* Add custom CSS field to config
This allows users to set/customize an instance's theme and appearance to
their liking. The config CSS field is prepopulated with all default CSS
variable values to allow quick editing.
Note that this can be somewhat of a "footgun" if someone updates the
CSS to hide all fields/search/etc. Should probably add some sort of
bandaid "admin" feature for public instances to employ until the whole
cookie/session issue is investigated further.
* Symlink all app static files to test dir
* Refactor app/misc/*.json -> app/static/settings/*.json
The country/language json files are used for user config settings, so
the "misc" name didn't really make sense. Also moved these to the static
folder to make testing easier.
* Fix light theme variables in dark theme css
* Minor style tweaking
The app/utils/*_utils weren't named very well, and all have been updated
to have more accurate names.
Function and class documention for the utils have been updated as well,
as part of the effort to improve overall documentation for the project.
Introduces a new content security policy header for responses to all
requests to reduce the possibility of ip leaks to outside connections.
By default blocks all inline scripts, and only allows content loaded
from Whoogle.
Refactors a few small inline scripting cases in the project to their own
individual scripts.
Requiring authentication for accessing the opensearch template prevents
the browser from accessing the file when adding as a default search
engine. This removes the authentication requirement from the opensearch
route, which should never provide any sensitive information anyways.
Bang operator can now be placed anywhere in the query, to allow for peak
efficiency in stream of consciousness querying (i.e. `big !reddit
chungus` will search reddit for big chungus`).
Fixes#196
* Adds the ability to redirect reddit.com to libredd.it using the existing
"site alts" config setting.
This adds the WHOOGLE_ALT_RD environment variable for optionally
redirecting reddit links to libreddit
(https://github.com/spikecodes/libreddit).
* Include libreddit in home page site alt note
Pip installs of whoogle search were missing access to the misc/ folder,
which previously contained the language and country json files. These
have been moved to app/misc, and the previous root level misc/ was
renamed to config/ (since it now only contains the tor config files).
Bump to 0.3.1.
Heroku instances were using the base http url when formatting the
opensearch.xml template. This adds a new routing utility, "needs_https",
which can be used for determining if the url in question needs
upgrading.
With javascript disabled, searches could not be submitted on the results
page using the "Enter" key. Adding a hidden submit button to the header
template resolves this issue.
The lxml dependency in the project was fairly unnecessary, and made the
initial build time for the project considerably slower. This replaces
all instances of lxml with either the default html parser (for bs4
constructors) or the built in xml.etree package (for search suggestion
parsing).