Documentation
¶
Overview ¶
Package bloomfilters provides a basic, generic and async implementations of Bloom filters in Go.
Index ¶
- Variables
- type Bits
- func (b *Bits) BitsCount() uint64
- func (b *Bits) Copy() Bits
- func (b *Bits) Equals(other *Bits) bool
- func (b *Bits) GetHash(hash uint64) bool
- func (b *Bits) Getbit(index uint64) bool
- func (b *Bits) MarshalBinary() (data []byte, err error)
- func (b *Bits) MarshalJSON() ([]byte, error)
- func (b *Bits) MarshalText() (text []byte, err error)
- func (b *Bits) SetHash(hash uint64)
- func (b *Bits) Setbit(index uint64)
- func (b *Bits) Size() uint64
- func (b *Bits) UnmarshalBinary(data []byte) error
- func (b *Bits) UnmarshalJSON(d []byte) error
- func (b *Bits) UnmarshalText(text []byte) error
- func (b *Bits) Words() []uint64
- type BloomFilter
- type BloomFilterOptions
- func WithAllHashFunctions() BloomFilterOptions
- func WithAppendHashFunctions(hashFunctions []bloomhashes.HashFunction) BloomFilterOptions
- func WithBits(bits Bits) BloomFilterOptions
- func WithDefaultHashFunctions() BloomFilterOptions
- func WithHashFunctions(hashFunctions []bloomhashes.HashFunction) BloomFilterOptions
- func WithSize(size uint64) BloomFilterOptions
- func WithWords(words []uint64) BloomFilterOptions
- type ConcurrentBloomFilter
- func (bf *ConcurrentBloomFilter) Add(data []byte)
- func (bf *ConcurrentBloomFilter) Bits() Bits
- func (bf *ConcurrentBloomFilter) BitsCount() uint64
- func (bf *ConcurrentBloomFilter) GetHash(hash uint64) bool
- func (bf *ConcurrentBloomFilter) SetHash(hash uint64)
- func (bf *ConcurrentBloomFilter) Test(data []byte) bool
- type GenericBloomFilter
- func (g *GenericBloomFilter[T]) Add(data T)
- func (bf *GenericBloomFilter[T]) Bits() Bits
- func (bf *GenericBloomFilter[T]) BitsCount() uint64
- func (g *GenericBloomFilter[T]) GetHash(hash uint64) bool
- func (g *GenericBloomFilter[T]) SetHash(hash uint64)
- func (g *GenericBloomFilter[T]) Test(data T) bool
- type IBloomFilter
Examples ¶
Constants ¶
This section is empty.
Variables ¶
Functions ¶
This section is empty.
Types ¶
type Bits ¶
type Bits struct {
// contains filtered or unexported fields
}
Bits is a structure that represents a bit array for use in bloom filters. It provides methods to set and get bits, as well as to marshal and unmarshal the data for storage or transmission.
Example ¶
package main
import (
"fmt"
bloomfilters "github.com/daanv2/go-bloom-filters"
)
func main() {
bits := bloomfilters.NewBits(128) // Create a Bits structure with 128 bits
bits.Setbit(5) // Set the bit at index 5
bits.Setbit(10) // Set the bit at index 10
// Check if specific bits are set
fmt.Println(bits.Getbit(5))
fmt.Println(bits.Getbit(10))
fmt.Println(bits.Getbit(0))
fmt.Println(bits.Getbit(127))
}
Output: true true false false
func NewBits ¶
NewBits creates a new Bits structure with the specified size in bits. It calculates the necessary number of uint64 words to accommodate the specified size.
func (*Bits) BitsCount ¶
BitsCount returns the total number of bits that are set to 1 in the bloom filter.
func (*Bits) Equals ¶
Equals returns true if this Bits structure is equal to the other Bits structure.
func (*Bits) GetHash ¶ added in v1.0.2
GetHash checks if the bit at the index corresponding to the given hash value is set to 1.
func (*Bits) Getbit ¶
Getbit returns the value of the bit at the specified index (true if set, false otherwise).
func (*Bits) MarshalBinary ¶
MarshalBinary implements encoding.BinaryMarshaler.
func (*Bits) MarshalJSON ¶
MarshalJSON implements json.Marshaler.
func (*Bits) MarshalText ¶
MarshalText implements encoding.TextMarshaler.
func (*Bits) SetHash ¶ added in v1.0.2
SetHash sets the bit at the index corresponding to the given hash value to 1.
func (*Bits) Setbit ¶
Setbit sets the bit at the specified index to 1. If the index is out of bounds (greater than or equal to the size of the Bits), it will not set any bit.
func (*Bits) UnmarshalBinary ¶
UnmarshalBinary implements encoding.BinaryUnmarshaler.
func (*Bits) UnmarshalJSON ¶
UnmarshalJSON implements json.Unmarshaler.
func (*Bits) UnmarshalText ¶
UnmarshalText implements encoding.TextUnmarshaler.
type BloomFilter ¶
type BloomFilter struct {
// contains filtered or unexported fields
}
BloomFilter is a probabilistic data structure that tests whether an element is a member of a set. False positive matches are possible, but false negatives are not.
Example ¶
Example of basic BloomFilter usage
package main
import (
"fmt"
bloomfilters "github.com/daanv2/go-bloom-filters"
"github.com/daanv2/go-bloom-filters/pkg/bloomhashes"
)
func main() {
// Create a bloom filter with size 1024 and FNV hash function
bf, _ := bloomfilters.NewBloomFilter(
bloomfilters.WithSize(1024),
bloomfilters.WithHashFunctions([]bloomhashes.HashFunction{bloomhashes.Fnv1_64}),
)
// Add some data
bf.Add([]byte("apple"))
bf.Add([]byte("banana"))
bf.Add([]byte("orange"))
// Test if data is in the filter
fmt.Println(bf.Test([]byte("apple")))
fmt.Println(bf.Test([]byte("banana")))
fmt.Println(bf.Test([]byte("grape")))
}
Output: true true false
Example (MultipleHashes) ¶
Example demonstrating bloom filter with multiple hash functions
package main
import (
"fmt"
bloomfilters "github.com/daanv2/go-bloom-filters"
"github.com/daanv2/go-bloom-filters/pkg/bloomhashes"
)
func main() {
// Create a bloom filter with multiple hash functions for better accuracy
bf, _ := bloomfilters.NewBloomFilter(
bloomfilters.WithSize(2048),
bloomfilters.WithHashFunctions([]bloomhashes.HashFunction{
bloomhashes.Fnv1_64,
bloomhashes.Fnv1_64a,
bloomhashes.MD5,
}),
)
// Add email addresses
bf.Add([]byte("user@example.com"))
bf.Add([]byte("admin@example.com"))
// Check if emails exist
fmt.Println(bf.Test([]byte("user@example.com")))
fmt.Println(bf.Test([]byte("admin@example.com")))
fmt.Println(bf.Test([]byte("guest@example.com")))
}
Output: true true false
func NewBloomFilter ¶
func NewBloomFilter(opts ...BloomFilterOptions) (*BloomFilter, error)
NewBloomFilter creates a new bloom filter with the given options. It returns an error if the size of the bloom filter is invalid or if any of the hash functions are nil.
func (*BloomFilter) Add ¶
func (bf *BloomFilter) Add(data []byte)
Add adds the given data to the bloom filter by applying each hash function to the data and setting the corresponding bits in the filter.
func (*BloomFilter) Bits ¶
func (bf *BloomFilter) Bits() Bits
Bits returns a copy of the Bits struct representing the bit array of the bloom filter. Modifying the returned Bits will not affect the internal state of the bloom filter.
func (*BloomFilter) BitsCount ¶
func (bf *BloomFilter) BitsCount() uint64
BitsCount returns the total number of bits that are set to 1 in the bloom filter.
func (*BloomFilter) GetHash ¶
func (bf *BloomFilter) GetHash(hash uint64) bool
Get checks if the bit at the index corresponding to the given hash value is set to 1.
func (*BloomFilter) SetHash ¶
func (bf *BloomFilter) SetHash(hash uint64)
Set sets the bit at the index corresponding to the given hash value to 1.
func (*BloomFilter) Test ¶
func (bf *BloomFilter) Test(data []byte) bool
Test checks if the given data is likely to be in the bloom filter by applying each hash function to the data and checking if the corresponding bits in the filter are set. It returns true if all bits are set, indicating that the data is likely to be in the filter, and false otherwise.
func (*BloomFilter) Words ¶
func (bf *BloomFilter) Words() []uint64
Words returns a copy slice of uint64 words representing the bits in the bloom filter.
type BloomFilterOptions ¶
type BloomFilterOptions interface {
// contains filtered or unexported methods
}
func WithAllHashFunctions ¶
func WithAllHashFunctions() BloomFilterOptions
WithAllHashFunctions sets all available hash functions to be used by the bloom filter. It retrieves all hash functions from the bloomhashes package and assigns them to the bloom filter's hashes field.
func WithAppendHashFunctions ¶
func WithAppendHashFunctions(hashFunctions []bloomhashes.HashFunction) BloomFilterOptions
WithAppendHashFunctions appends additional hash functions to the existing list of hash functions used by the bloom filter. It takes a slice of HashFunction and appends it to the current list of hash functions.
func WithBits ¶
func WithBits(bits Bits) BloomFilterOptions
WithBits sets the bits of the bloom filter using a Bits structure. It directly assigns the provided Bits to the bloom filter, allowing for more flexible manipulation of the bit array.
func WithDefaultHashFunctions ¶
func WithDefaultHashFunctions() BloomFilterOptions
WithDefaultHashFunctions sets the default hash functions to be used by the bloom filter. It retrieves the default hash functions from the bloomhashes package and assigns them to the bloom filter's hashes field.
func WithHashFunctions ¶
func WithHashFunctions(hashFunctions []bloomhashes.HashFunction) BloomFilterOptions
WithHashFunctions sets the hash functions to be used by the bloom filter. It replaces any existing hash functions with the provided slice of HashFunction.
func WithSize ¶
func WithSize(size uint64) BloomFilterOptions
WithSize sets the size of the bloom filter in bits. It calculates the necessary number of uint64 words to accommodate the specified size and initializes the Bits structure accordingly.
func WithWords ¶
func WithWords(words []uint64) BloomFilterOptions
WithWords sets the bits of the bloom filter using a slice of uint64 words. It initializes the Bits structure with the provided words, allowing for direct manipulation of the bloom filter's bit array.
type ConcurrentBloomFilter ¶
type ConcurrentBloomFilter struct {
// contains filtered or unexported fields
}
ConcurrentBloomFilter is a thread-safe bloom filter that uses a spinlock for concurrent access. It is safe to call Add and Test methods from multiple goroutines.
Example ¶
Example of basic ConcurrentBloomFilter usage
package main
import (
"fmt"
bloomfilters "github.com/daanv2/go-bloom-filters"
"github.com/daanv2/go-bloom-filters/pkg/bloomhashes"
)
func main() {
// Create a bloom filter with size 1024 and FNV hash function
bf, _ := bloomfilters.NewConcurrentBloomFilter(
bloomfilters.WithSize(1024),
bloomfilters.WithHashFunctions([]bloomhashes.HashFunction{bloomhashes.Fnv1_64}),
)
// Add some data
bf.Add([]byte("apple"))
bf.Add([]byte("banana"))
bf.Add([]byte("orange"))
// Test if data is in the filter
fmt.Println(bf.Test([]byte("apple")))
fmt.Println(bf.Test([]byte("banana")))
fmt.Println(bf.Test([]byte("grape")))
}
Output: true true false
Example (MultipleHashes) ¶
Example demonstrating concurrent bloom filter with multiple hash functions
package main
import (
"fmt"
bloomfilters "github.com/daanv2/go-bloom-filters"
"github.com/daanv2/go-bloom-filters/pkg/bloomhashes"
)
func main() {
// Create a bloom filter with multiple hash functions for better accuracy
bf, _ := bloomfilters.NewConcurrentBloomFilter(
bloomfilters.WithSize(2048),
bloomfilters.WithHashFunctions([]bloomhashes.HashFunction{
bloomhashes.Fnv1_64,
bloomhashes.Fnv1_64a,
bloomhashes.MD5,
}),
)
// Add email addresses
bf.Add([]byte("user@example.com"))
bf.Add([]byte("admin@example.com"))
// Check if emails exist
fmt.Println(bf.Test([]byte("user@example.com")))
fmt.Println(bf.Test([]byte("admin@example.com")))
fmt.Println(bf.Test([]byte("guest@example.com")))
}
Output: true true false
func NewConcurrentBloomFilter ¶
func NewConcurrentBloomFilter(opts ...BloomFilterOptions) (*ConcurrentBloomFilter, error)
NewConcurrentBloomFilter creates a new concurrent bloom filter with the given options. It returns an error if the size of the bloom filter is invalid or if any of the hash functions are nil.
func (*ConcurrentBloomFilter) Add ¶
func (bf *ConcurrentBloomFilter) Add(data []byte)
Add adds the given data to the bloom filter by applying each hash function to the data and setting the corresponding bits in the filter. This method is thread-safe.
func (*ConcurrentBloomFilter) Bits ¶
func (bf *ConcurrentBloomFilter) Bits() Bits
Bits returns a copy of the Bits struct representing the bit array of the bloom filter. Modifying the returned Bits will not affect the internal state of the bloom filter.
func (*ConcurrentBloomFilter) BitsCount ¶
func (bf *ConcurrentBloomFilter) BitsCount() uint64
BitsCount returns the total number of bits that are set to 1 in the bloom filter.
func (*ConcurrentBloomFilter) GetHash ¶
func (bf *ConcurrentBloomFilter) GetHash(hash uint64) bool
GetHash checks if the bit at the index corresponding to the given hash value is set to 1. This method is thread-safe.
func (*ConcurrentBloomFilter) SetHash ¶
func (bf *ConcurrentBloomFilter) SetHash(hash uint64)
SetHash sets the bit at the index corresponding to the given hash value to 1. This method is thread-safe.
func (*ConcurrentBloomFilter) Test ¶
func (bf *ConcurrentBloomFilter) Test(data []byte) bool
Test checks if the given data is likely to be in the bloom filter by applying each hash function to the data and checking if the corresponding bits in the filter are set. This method is thread-safe. It returns true if all bits are set, indicating that the data is likely to be in the filter, and false otherwise.
type GenericBloomFilter ¶
type GenericBloomFilter[T any] struct { // contains filtered or unexported fields }
GenericBloomFilter is a wrapper around IBloomFilter that allows using custom types with a serializer function. It converts custom types to byte slices before passing them to the underlying bloom filter.
func NewGenericBloomFilter ¶
func NewGenericBloomFilter[T any](base IBloomFilter, getBytes func(T) []byte) *GenericBloomFilter[T]
NewGenericBloomFilter creates a new generic bloom filter that wraps the provided bloom filter and uses the given serializer function. The serializer function converts custom types to byte slices for hashing.
func (*GenericBloomFilter[T]) Add ¶
func (g *GenericBloomFilter[T]) Add(data T)
Add implements IBloomFilter.
func (*GenericBloomFilter[T]) Bits ¶
func (bf *GenericBloomFilter[T]) Bits() Bits
Bits returns a copy of the Bits struct representing the bit array of the bloom filter. Modifying the returned Bits will not affect the internal state of the bloom filter.
func (*GenericBloomFilter[T]) BitsCount ¶
func (bf *GenericBloomFilter[T]) BitsCount() uint64
BitsCount returns the total number of bits that are set to 1 in the bloom filter.
func (*GenericBloomFilter[T]) GetHash ¶
func (g *GenericBloomFilter[T]) GetHash(hash uint64) bool
GetHash implements IBloomFilter.
func (*GenericBloomFilter[T]) SetHash ¶
func (g *GenericBloomFilter[T]) SetHash(hash uint64)
SetHash implements IBloomFilter.
func (*GenericBloomFilter[T]) Test ¶
func (g *GenericBloomFilter[T]) Test(data T) bool
Test implements IBloomFilter.
type IBloomFilter ¶
type IBloomFilter interface {
SetHash(hash uint64) // SetHash sets the bit corresponding to the given hash value in the bloom filter.
GetHash(hash uint64) bool // GetHash checks if the bit corresponding to the given hash value is set in the bloom filter.
Add(data []byte) // Add adds the given data to the bloom filter by applying the hash functions and setting the corresponding bits.
Test(data []byte) bool // Test checks if the given data is likely to be in the bloom filter by applying the hash functions and checking the corresponding bits.
BitsCount() uint64 // BitsCount returns the total number of bits that are set to 1 in the bloom filter.
Bits() Bits // Bits returns the underlying bit array of the bloom filter.
}
IBloomFilter defines the interface for bloom filter implementations. It provides methods for adding elements, testing membership, and managing hash values.