Searching through large databases is often a linear time problem. Here I compare the performance of applying a bloom filter and using the regular std::find command in C++:Codes are from: https://codereview.stackexchange.com/questions/179135/bloom-filter-implementation-in-c
Simple example Strings in the database: Hello World! sweet potato C++ alpha beta gamma Number of strings in the database: 6 Number of strings in the test set: 10 Strings to look for: Bloom filter takes (ms):51 Regular search takes takes (ms):39
Then I tested it on Ligity-like hashes in the form of AAA000, where AAA were the three vertex types and 000 were the three edge length indices. I generate a list of random hashes in this configuration and varied the number of prefixes – where the first n characters are the same.
Applying it to the Ligity-like hashes reinforces two ideas: a) having a good hashing function that adequately separate out items are important b) bloom filter requires the prior overhead in encoding the hashes, but this overhead helps save a significant amount of time when a large number of searches were carried out.