A Unix-style personal search engine and web crawler for your digital footprint

🔥 341 · 💬 92 · 2 years ago · github.com · amirGi

The actual search engine takes a query, tokenizes and stems it, finds the relevant results from the inverted index using those stemmed tokens, then ranks the results with TF-IDF.

A package pulls in data from a couple of different sources; if you want to pull data from a custom data source, this is where you should add it. Because all data gets parsed into this standardized format, you can link any data source you want: if you build your own tool, or if you store a lot of data in some existing one, you don't have to add everything manually. You can pull in data from any data source, provided you give the API data in this format.

Apollo can't handle all types of data; it's not designed to. That said, if you want to index all your Twitter data, for example, this is possible, since all of the data can be absorbed in a consistent format, converted into the compatible Apollo format, and sent off.

Note that since Apollo syncs from some personal data sources, you'll want to remove them, add your own, or build on top of them.

Future: improve the search algorithm, perhaps something more like Elasticsearch once the data grows a lot?
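To make the search pipeline described above concrete, here is a minimal sketch in Python of the general technique: records in one common shape are tokenized and stemmed, stored in an inverted index, and ranked with TF-IDF at query time. It is not Apollo's actual code. The `Record` fields (`title`, `link`, `content`), the `Index` class, and the crude suffix-stripping `stem` function are all assumptions made for illustration; a real stemmer (for example Porter's) would replace `stem`, and the IDF uses a smoothed variant, `log(1 + N / df)`, so that terms found in every record still contribute.

```python
import math
import re
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass
class Record:
    """Hypothetical standardized record; the field names are assumptions."""
    title: str
    link: str
    content: str


def stem(token: str) -> str:
    # Crude suffix stripping, a stand-in for a real stemmer such as Porter's.
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def tokenize(text: str) -> list[str]:
    # Lowercase, split on non-alphanumeric characters, then stem each token.
    return [stem(t) for t in re.findall(r"[a-z0-9]+", text.lower())]


class Index:
    def __init__(self) -> None:
        self.records: list[Record] = []
        # stemmed token -> {record id -> term frequency in that record}
        self.postings: dict[str, dict[int, int]] = defaultdict(dict)

    def add(self, record: Record) -> None:
        # Any data source can be indexed once it is converted to a Record.
        doc_id = len(self.records)
        self.records.append(record)
        counts = Counter(tokenize(record.title + " " + record.content))
        for token, count in counts.items():
            self.postings[token][doc_id] = count

    def search(self, query: str) -> list[Record]:
        scores: dict[int, float] = defaultdict(float)
        total = len(self.records)
        for token in set(tokenize(query)):
            matches = self.postings.get(token, {})
            if not matches:
                continue
            # Smoothed inverse document frequency: rarer tokens weigh more.
            idf = math.log(1 + total / len(matches))
            for doc_id, tf in matches.items():
                scores[doc_id] += tf * idf
        ranked = sorted(scores, key=lambda d: scores[d], reverse=True)
        return [self.records[d] for d in ranked]


index = Index()
index.add(Record("Crawling notes", "https://example.com/a",
                 "Crawlers fetch pages and crawl links"))
index.add(Record("Gardening", "https://example.com/b",
                 "Planting tomatoes in spring"))
index.add(Record("Search basics", "https://example.com/c",
                 "Ranking pages with an inverted index"))

for hit in index.search("crawling pages"):
    print(hit.link)
```

With these three sample records, the query "crawling pages" ranks the first record ahead of the third, because the stemmed token `crawl` appears twice in it and in no other record, while the gardening record does not match at all. A custom data source would only need a small adapter that emits `Record` values and calls `index.add`.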