
136 Commits (2114cb87c060a6244fbd3d5aad4634413de34c4e)

Author SHA1 Message Date
Joao A. Candido Ramos bf4bf1ff2c
Split interface and results language config (#89)
Adds support for choosing the language of search results separately from the language of the interface (allowing a default given by Google).

Co-authored-by: Joao <ramos.joao@protonmail.com>
2020-06-27 14:23:17 -06:00
Marvin Borner dd9d87d25b
Added ddg-style !bang-operators
This is a proof of concept! The code works, but uses hardcoded operators
and may be placed in the wrong file/class.
The best-case scenario would be the possibility to use the 13,000+ ddg
operators, but I don't know if that's possible without having to
redirect to duckduckgo first.
2020-06-26 00:26:02 +02:00
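
The hardcoded-operator approach described in commit dd9d87d25b might look roughly like the sketch below. This is an illustration only, not the code from that commit: the `BANGS` table, its two entries, and the `resolve_bang` helper are all hypothetical names chosen for the example.

```python
from urllib.parse import quote_plus

# Hypothetical, hardcoded subset of bang operators (not the project's list).
BANGS = {
    "!w": "https://en.wikipedia.org/wiki/Special:Search?search={}",
    "!gh": "https://github.com/search?q={}",
}


def resolve_bang(query):
    """Return a redirect URL if the query contains a known bang, else None."""
    words = query.split()
    for word in words:
        if word in BANGS:
            # Everything except the operator itself becomes the search terms.
            rest = " ".join(w for w in words if w != word)
            return BANGS[word].format(quote_plus(rest))
    return None
```

A query without a known operator returns `None`, so the caller can fall through to the normal search path.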
Ben Busby ebfa87f561
Fixed dark mode footer text color
Updated to use config accessor rather than boolean value
2020-06-11 21:13:43 -06:00
Ben Busby b2133edaa3
Session refactoring and improved filter (#86)
* Project refactor (#85)

* Major refactor of requests and session management

- Switches from pycurl to requests library
  - Allows for less janky decoding, especially with non-latin character
  sets
- Adds session level management of user configs
  - Allows for each session to set its own config -- users with blocked cookies fall back to the "default" profile (same usage as before)
- Updates key gen/regen to more aggressively swap out keys after each
request

* Added ability to save/load configs by name

- New PUT method for config allows changing config with specified name
- New methods in js controller to handle loading/saving of configs

* Result formatting and removal of unused elements

- Fixed question section formatting from results page (added appropriate
padding and made questions styled as italic)
- Removed user agent display from main config settings

* Minor change to save config button label (now "Save As...")

* Fixed issue with "de-pickling" of flask session

Having a gitignore-everything ("*") file within a flask session folder seems to cause a
weird bug where the state of the app becomes unusable from continuously
trying to prune files listed in the gitignore (and it can't prune '*').

* Switched to pickling saved configs

* Updated ad/sponsored content filter and conf naming

Configs are now named with a .conf extension to allow for easier manual
cleanup/modification of named config files

Sponsored content now removed by basic string matching of span content

* Version bump to 0.2.0

* Fixed request.send return style

* Moved custom conf files to their own directory

* Refactored whoogle session mgmt

Now allows a fallback "default" session to be used if a user's browser
is blocking cookies

* Reworked pytest client fixture to support new session mgmt

* Added better multilingual support, updated filter

Results page now includes method for switching to "All Languages" from
whichever language is specified as the primary in the config (see #74).

Also removes the non-Whoogle links from the page footer, leaving only
the page navigation controls

Added support for the date range filter on the results page, though I'd
still recommend using the ":past <unit>" query instead.

* Removed no-cache enforcement, minor styling/formatting improvements

* Improving ad filtering for non-English languages

* Added footer to results page
2020-06-11 13:38:51 -06:00
Ben Busby 71ba00785f Quick improvement to ad removal 2020-05-29 13:21:53 -06:00
Ben Busby cb18bc6ccc Updated autocomplete styling
Added dark theme specific stylesheet to use if dark mode is active
2020-05-26 10:58:37 -06:00
Ben Busby 78939e7fb4 Reworked google url routing 2020-05-26 10:47:40 -06:00
Ben Busby 98d639883c Fixing styling/url/safe mode inconsistencies 2020-05-26 10:39:19 -06:00
Ben Busby 9212f9921a Fixed #76
Added enter key submit on results page

Added results type carryover for subsequent searches on results page

Removed redundant header on image search results
2020-05-25 10:53:15 -06:00
Ben Busby d1f38cf924 Fixed styling of footer in dark mode 2020-05-25 10:33:24 -06:00
Ben Busby 21012f5265
Feature: autocomplete/search suggestions (#72)
Basic autocomplete/search suggestion functionality added

* Adds new GET and POST routes for '/autocomplete' that accept a string query and return an array of suggestions

* Adds new autoscript.js file for handling queries on the main page and results view

* Updated requests class to include autocomplete method

* Updated opensearch template to handle search suggestions

* Added header template to allow for autocomplete on results view

* Updated readme to mention autocomplete feature
2020-05-24 14:03:11 -06:00
Ben Busby 3dbe51e9e7 Removing google's filter card from results 2020-05-24 12:53:21 -06:00
Ben Busby 09c53b52af
Feature: country and safe search config options (#71)
* Added country and safe search config options

* Updated handling of parser error in results test

* Improved handling of default country

* Added 1px empty gif fallback as a replacement for images that fail to load
2020-05-23 14:27:23 -06:00
Ben Busby 699aa4f2e7 Bumped version to 0.1.4 2020-05-22 16:08:47 -06:00
Ben Busby b131f47641 Bumped version to v0.1.3
(forgot to update pip package version)
2020-05-22 10:45:49 -06:00
Ben Busby f1e17d8119
Bumped version to v0.1.2 2020-05-22 10:38:58 -06:00
Ben Busby c51f186419 Added version footer, minor PEP 8 refactoring 2020-05-20 11:02:30 -06:00
Ben Busby 38b7b19e2a
Added basic authentication (#51)
Username/password can be set either as Dockerfile build arguments or
passed into the run script as "--userpass <username:password>"
2020-05-18 10:30:32 -06:00
Paul Rothrock 0e39b8f97b
Added "I'm feeling lucky" function (#46)
* Putting '! ' at the beginning of the query now redirects to the first search result

Signed-off-by: Paul Rothrock <paul@movetoiceland.com>

* Moved get_first_url outside of filter class

Signed-off-by: Paul Rothrock <paul@movetoiceland.com>
2020-05-18 10:28:23 -06:00
Ben Busby a4382d59f6
Updated redirect code used in https redirects
See https://developer.mozilla.org/en-US/docs/Web/HTTP/Redirections

301 redirections do not guarantee that the request method is kept intact; clients can occasionally change it from POST to GET

308 redirections always keep the request method, which is necessary for all POST search requests
2020-05-16 09:31:07 -06:00
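
The 301 versus 308 distinction from commit a4382d59f6 can be sketched without a web framework. This is a minimal illustration under assumptions, not the project's handler: `https_redirect` is a hypothetical helper that returns the location and status code a redirect response would carry.

```python
from urllib.parse import urlsplit, urlunsplit

# 308 Permanent Redirect: the client must repeat the request with the
# same method and body, so a POST search stays a POST.
PERMANENT_REDIRECT = 308


def https_redirect(url):
    """Return (location, status) for an http:// URL, or None if no redirect is needed."""
    parts = urlsplit(url)
    if parts.scheme != "http":
        return None
    # Swap only the scheme; host, path and query are left untouched.
    return urlunsplit(parts._replace(scheme="https")), PERMANENT_REDIRECT
```

In a real handler the returned pair would be passed to whatever redirect function the framework provides.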
Ben Busby b4165f9957 Minor improvement to https enforcement 2020-05-15 16:29:22 -06:00
Ben Busby 3123789584
Added config option for opening links in new tab (#49) 2020-05-15 16:10:31 -06:00
Ben Busby 1ed6178e9a
Feature: https only -- adds option to enforce https on running instances (#48)
* Adding HTTPS enforcement

Command line runs of Whoogle Search through pip/pipx/etc will need the
`--https-only` flag appended to the run command.

Docker runs require the `use_https` build arg applied.

* Update README.md

Moved https-only note to top of docker run command, updated pip runner help output

* Dockerfile: removed HTTPS enforcement, updated PORT setting

Dockerfile no longer enforces an HTTPS connection, but still allows for
setting via a build arg. The Flask server port is now configurable as a
build arg as well, by setting a port number to "whoogle_port"

* Fixed incorrect port assignment
2020-05-15 15:44:50 -06:00
Ben Busby afd5b9aa83 Minor fix to dark mode on img results 2020-05-15 14:17:16 -06:00
Ben Busby 87f0a8d496
Added volume mounted config to Dockerfile (#39) 2020-05-13 18:27:04 -06:00
Ben Busby f4bd3df2bb
Added option to search only via GET request (#36)
This addresses #18, which brought up the issue of searching with Whoogle
with the search instance set to always use a specific container in
Firefox Container Tabs.

Could also be useful if you want to share your search results or
something, I guess. Though nobody likes when people do that.
2020-05-13 00:19:51 -06:00
Ben Busby a11ceb0a57
Feature: language config (#27)
* Added language configuration support

Main page now has a dropdown for selecting preferred language of
results.

Refactored config to be its own model with language constants.

* Added more language support

Interface language is now updated using the "hl" arg

Fixed Chinese traditional and simplified values

Updated decoding of characters to gb2312

* Updated to use conditional decoding dependent on language

* Updated filter to not rely on valid config to work properly
2020-05-12 17:15:53 -06:00
Jake Howard f700ed88e7
Swap out Flask's default web server for Waitress (#32)
* Ignore venv when building docker file

* Remove reference to 8888 port

It wasn't really used anywhere, and setting it to 5000 everywhere removes ambiguity, and makes things easier to track and reason about

* Use waitress rather than Flask's built in web server

It's not production grade

* Actually add waitress to requirements

Woops!
2020-05-12 17:14:55 -06:00
Ben Busby 445019d204 Fixed RAM usage bug
Pushing straight to master since this is an extremely simple fix, with
a pretty large performance benefit.

The Phyme library used for generating a User Agent rhyme was consuming
an absolute unit of memory. Now that it's removed, it's using about 10x
less memory, at the cost of User Agents being not as funny anymore.
2020-05-12 00:45:56 -06:00
Ben Busby 1798b6094d
Merge pull request #16 from Kombustor/patch-1
Add autofocus to input field
2020-05-10 14:09:42 -06:00
Ben Busby 7ccad2799e Added config option to address instance behind reverse proxy
Config options now allow setting a "root url", which defaults to the
request url root. Saving a new url in this field will allow for proper
redirects and usage of the opensearch element.

Also provides a possible solution for #17, where the default flask redirect method redirects to
http instead of https.
2020-05-10 13:27:02 -06:00
Fabian Schliski 743caf6cc7
Updating autofocus value 2020-05-10 20:17:32 +02:00
Fabian Schliski 0fc5fa9d99
Add autofocus to input field
Supported in all major browsers, allows the user to immediately start typing after loading the page.
2020-05-10 16:51:42 +02:00
Ben Busby 130ac4532e Refactored handling of user config
Now implemented as a flask global variable that reads from the same json file
as before, but doesn't crash if it does not find an existing file.

Removed user config creation from run script
2020-05-06 18:39:12 -06:00
Ben Busby d316fd77c6 Updated setup and routes for pipx compatibility 2020-05-06 18:13:02 -06:00
Ben Busby d01f56ea03 Removed referrer from links, refactored routes
Added <meta name="referrer" content="no-referrer"> to all whoogle
templates

Refactored search route to conditionally use either request.args or
request.form, depending on rest call (get vs post respectively)
2020-05-05 18:28:43 -06:00
Ben Busby 708769f682 Minor styling refactor, updated app name 2020-05-04 18:00:43 -06:00
Ben Busby 0300eab6df Updated formatting and setup instructions
Switched encoding from utf-8 to unicode-escape in an effort to support multiple
languages besides English.

Updated image results page formatting to fix bad image links (added TODO
for adding full res image link for each image result).

Updated README to include libcurl and libssl install instructions for
manual setup.
2020-05-03 19:32:47 -06:00
Ben Busby 39c475af21 Using urlencode "doseq" option for url args 2020-04-29 20:31:03 -06:00
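
For context on commit 39c475af21, the effect of the `doseq` option of `urllib.parse.urlencode` can be seen in a small example. The parameter names and values below are made up for illustration and are not taken from the repository.

```python
from urllib.parse import urlencode

# Made-up arguments; one key maps to a list of values.
params = {"q": "whoogle", "tbm": ["isch", "nws"]}

# Without doseq, the list is converted with str() and percent-encoded
# as a single value, which is rarely what a server expects.
flat = urlencode(params)

# With doseq=True, each list element becomes its own key=value pair.
expanded = urlencode(params, doseq=True)
```

This matters when arguments arrive as a mapping whose values are lists, since only the `doseq=True` form round-trips into repeated query parameters.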
Ben Busby 3e404cb524 Restructured valid params checking, added empty query redirect 2020-04-29 18:53:58 -06:00
Ben Busby c30f21f950 Minor conditional fix in filter 2020-04-29 14:46:00 -06:00
Ben Busby b83f14be26 Fixed image href filter
Needed to be checking against img attrs, not just the img object itself
2020-04-29 11:18:07 -06:00
Ben Busby 6d38abd1b4 Removed debug from opensearch template 2020-04-29 10:12:49 -06:00
Ben Busby dcd93d4869 Fixed filter params, updated search button text 2020-04-29 10:03:34 -06:00
Ben Busby 5fe308956b Cleaned up filter class, updated js config tool 2020-04-29 09:46:18 -06:00
Ben Busby 0a3da5cea4 Updated js controller and config api route
Controller was refactored to be a bit less monolithic.

Config route was updated to accept an html form data POST rather than
just a json object.
2020-04-28 20:50:12 -06:00
Ben Busby 1cbe394e6f Updated tests, fixed a few bugs
Added opensearch routes test and individual tests for searching via GET
and POST separately.

Fixed incorrect assignment in gen_query.
2020-04-28 18:59:33 -06:00
Ben Busby 0c0ebb8917 Added POST search, encrypted query strings, refactoring
The implementation of POST search support comes with a few benefits. The
most apparent is the avoidance of search queries appearing in web server
logs -- instead of the prior GET approach (i.e.
/search?q=my+search+query), using POST requests with the query stored in
the request body creates logs that simply appear as "/search".

Since a lot of relative links are generated in the results page, I came
up with a way to generate a unique key at run time that is used to
encrypt any query strings before sending to the user. This benefits both
regular text queries as well as fetching of image links and means that
web logs will only show an encrypted string where a link or query
string might slip through.

Unfortunately, GET search requests still need to be supported, as it
doesn't seem that Firefox (on iOS) supports loading search engines by
their opensearch.xml file, but instead relies on manual entry of a
search query string. Once this is updated, I'll probably remove GET
request search support.
2020-04-28 18:19:34 -06:00
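
The logging difference described in commit 0c0ebb8917 comes down to where the query sits in the request. The sketch below only builds the two request shapes and does not reproduce the project's encryption step; the variable names are illustrative.

```python
from urllib.parse import urlencode

query = {"q": "my search query"}

# GET: the query is part of the request target, which access logs record.
get_target = "/search?" + urlencode(query)

# POST: the request target is just the path; the query travels in the body,
# which web servers do not normally write to their access logs.
post_target = "/search"
post_body = urlencode(query).encode("ascii")
```

An access log line for the GET form would contain the full `get_target` string, while the POST form would show only `/search`.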
Ben Busby 4180aedd87 Added image proxying, refactored filter class
Images were previously directly fetched from google search results,
which was a potential privacy hazard. All image sources are now modified
to be passed through shoogle's routing first, which will then fetch raw
image data and pass it through to the user.

Filter class was refactored to split the primary clean method into
smaller, more manageable submethods.
2020-04-27 20:21:36 -06:00
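
The image proxying idea in commit 4180aedd87 can be sketched as a rewrite of each image source so that the browser asks the local instance rather than the original host. This is an assumption-laden illustration: the `/image` route name, the `url` parameter, and the `proxied_src` helper are hypothetical and not taken from the repository.

```python
from urllib.parse import quote


def proxied_src(original_src, route="/image"):
    """Rewrite an image URL so it is requested through a local route.

    The route name and the 'url' parameter are hypothetical.
    """
    # safe="" percent-encodes ':' and '/' too, so the original URL
    # survives as a single query-string value.
    return f"{route}?url={quote(original_src, safe='')}"
```

The local route would then fetch the raw image data server-side and pass it back, so the user's browser never contacts the image host directly.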
Ben Busby b0e6167733 Improved bad url arg filtering 2020-04-26 18:48:40 -06:00
Ben Busby 3bc58b64be Small update to filter class
The image results page seems to have different formatting from non-image
results pages. Should probably revisit this at some point and try to
style the image results page to be more in line with other result types.
2020-04-25 11:32:43 -06:00
Ben Busby 38c0f56322 Fixed gitignore, added required files 2020-04-24 19:03:22 -06:00
Ben Busby 1f6bfa092e Complete refactoring of opensearch
Refactored opensearch.xml to only exist as a template that is
served by a flask route, which is then populated with the
necessary url root.
2020-04-24 18:45:57 -06:00
Ben Busby 525f7adf22 Merge branch 'master' of github.com:benbusby/shoogle 2020-04-24 17:25:06 -06:00
Ben Busby e21341d6f4 Deployment related refactoring, fixes to Dockerfile
- Updated Dockerfile to include chmod of run script
- Added app.json for Heroku quick deploy
- Removed unused function var in js controller
- Moved requirements back to root of repo
- Added Codebeat report to readme
2020-04-24 17:23:08 -06:00
Ben Busby a7005c012e Refactoring of user requests and routing
Curl requests and user agent related functionality were moved to their own
request class.

Routes was refactored to only include strictly routing related
functionality.

Filter class was cleaned up (had routing/request related logic in here,
which didn't make sense)
2020-04-23 20:59:43 -06:00
Ben Busby 31b9e19af7 Fixed main banner in readme 2020-04-19 15:28:40 -06:00
Ben Busby 67e3c788c7 Updated readme, added screenshots 2020-04-19 15:23:39 -06:00
Ben Busby 6a150092a2 Fixed config bug in filter, updated run script to work on macOS 2020-04-16 18:50:31 -06:00
Ben Busby 2631335dbf Updated README 2020-04-16 18:37:24 -06:00
Ben Busby bd773ec5ff Small update to js config request 2020-04-16 18:12:30 -06:00
Ben Busby e72ccc4988 Small change to mobile styling 2020-04-16 10:10:18 -06:00
Ben Busby 024552f2df Minor refactor of filter class, updated tests, fixed html/css, added ua to config 2020-04-16 10:01:02 -06:00
Ben Busby b5b6e64177 Added testing and ci build, refactored filter class, refactored project structure 2020-04-15 17:41:53 -06:00
Ben Busby 67d8b0d99d Updated favicons 2020-04-12 14:26:32 -06:00
Ben Busby 48b9f66490 Fixed incorrect opensearch template, updated readme 2020-04-11 15:57:06 -06:00
Ben Busby 20fce34db3 Added opensearch setup 2020-04-11 15:24:00 -06:00
Ben Busby ea7ddce7b3 Updated dockerfile and run script to work with heroku deployment 2020-04-11 14:37:15 -06:00
Ben Busby 850a46aea1 Refactored routes, added filter class for returned results, added dockerignore 2020-04-10 14:52:27 -06:00
Ben Busby 08a8a3e064 Fixed missing page title 2020-04-08 19:13:25 +00:00
Ben Busby 5bfc4d9a74 Added user config for nojs links and dark mode, minor styling updates 2020-04-08 12:47:21 -06:00
Ben Busby a00ccb1da8 Small fix for viewing images on mobile, updated document title formatting 2020-04-08 18:11:08 +00:00
Ben Busby 2411f9de8d Fixed bug in nojs config setting, updated pages to use new favicon and proper headers 2020-04-07 14:12:16 -06:00
Ben Busby 5687c87a65 Adding optional nojs links to results page, changed nojs to a user setting 2020-04-07 17:04:03 +00:00
Ben Busby 6a82f6e1ad Added filtering of sponsored content 2020-04-06 18:20:44 +00:00
Ben Busby 066c253c4d Added ability to update config from home page 2020-04-05 17:59:50 -06:00
Ben Busby 9c0b4a7f58 Minor fix for filtering by time range 2020-04-05 16:37:35 -06:00
Ben Busby 254c987254 Added filter by date range, minor aesthetic changes 2020-04-05 16:15:46 -06:00
Ben Busby 9fbaa1d6cf Added run script, updated to use config json file for general location, general restyling 2020-04-04 19:30:53 -06:00
Ben Busby d90468c667 Updated to remove ads, minor renaming refactor 2020-04-03 18:02:45 +00:00
Ben Busby 24aa4367d3 Added optional no-js functionality, added location based searching (hardcoded), updated html 2020-02-21 23:52:29 +00:00
Ben Busby 4636b0f695 Added html parsing to remove returned scripts, added logo 2020-01-23 06:19:17 +00:00
Ben Busby b11fc5fe67 Improved homepage styling 2020-01-22 07:15:29 +00:00
Ben Busby a922b42cbd Added desktop/mobile agent switching, updated gitignore 2020-01-22 05:51:02 +00:00
Ben Busby 1e1bb4a55a Added tbm (images/news/etc) handling, updated front page and search controls 2020-01-21 18:07:08 -07:00
Ben Busby 6e7eef165e Initial commit 2020-01-21 13:26:49 -07:00