New API #7

Merged on Oct 16, 2023 (29 commits)
6 changes: 2 additions & 4 deletions .gitignore
@@ -1,4 +1,2 @@
-.stack-work
-.cabal-sandbox
-cabal.sandbox.config
-dist
+.stack-work/
+*~
29 changes: 0 additions & 29 deletions .travis-setup.sh

This file was deleted.

35 changes: 0 additions & 35 deletions .travis.yml

This file was deleted.

2 changes: 1 addition & 1 deletion LICENSE
@@ -1,6 +1,6 @@
BSD 3-Clause License

-Copyright (c) 2017-2022, Heikki Johannes Hildén
+Copyright (c) 2017-present, Heikki Johannes Hildén
All rights reserved.

Redistribution and use in source and binary forms, with or without
73 changes: 26 additions & 47 deletions README.md
@@ -1,60 +1,39 @@
# fuzzyset [![Haskell CI](https://github.com/laserpants/fuzzyset-haskell/actions/workflows/haskell.yml/badge.svg)](https://github.com/laserpants/fuzzyset-haskell/actions/workflows/haskell.yml) [![License](https://img.shields.io/badge/license-BSD%203--Clause-blue.svg)](https://opensource.org/licenses/BSD-3-Clause) [![Language](https://img.shields.io/badge/language-Haskell-yellow.svg)](https://www.haskell.org/) [![Hackage](https://img.shields.io/hackage/v/fuzzyset.svg)](http://hackage.haskell.org/package/fuzzyset)
# fuzzyset-haskell

A fuzzy string set data structure for approximate string matching.
[![License](https://img.shields.io/badge/license-BSD%203--Clause-blue.svg)](https://opensource.org/licenses/BSD-3-Clause)
[![Language](https://img.shields.io/badge/language-Haskell-yellow.svg)](https://www.haskell.org/)
[![Hackage](https://img.shields.io/hackage/v/fuzzyset.svg)](http://hackage.haskell.org/package/fuzzyset)

## Examples
A fuzzy string set data structure for approximate string matching.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import Data.FuzzySet

states = [ "Alabama" , "Alaska" , "American Samoa" , "Arizona" , "Arkansas"
         , "California" , "Colorado" , "Connecticut" , "Delaware" , "District of Columbia"
         , "Florida" , "Georgia" , "Guam" , "Hawaii" , "Idaho"
         , "Illinois" , "Indiana" , "Iowa" , "Kansas" , "Kentucky"
         , "Louisiana" , "Maine" , "Maryland" , "Massachusetts" , "Michigan"
         , "Minnesota" , "Mississippi" , "Missouri" , "Montana" , "Nebraska"
         , "Nevada" , "New Hampshire" , "New Jersey" , "New Mexico" , "New York"
         , "North Carolina" , "North Dakota" , "Northern Marianas Islands" , "Ohio" , "Oklahoma"
         , "Oregon" , "Pennsylvania" , "Puerto Rico" , "Rhode Island" , "South Carolina"
         , "South Dakota" , "Tennessee" , "Texas" , "Utah" , "Vermont"
         , "Virginia" , "Virgin Islands" , "Washington" , "West Virginia" , "Wisconsin"
         , "Wyoming" ]

statesSet = fromList states

main = mapM_ print (get statesSet "Burger Islands")
```
In a nutshell:

The output of this program is:
1. Add data to the set (see `add`, `add_`, `addMany`, and `addMany_`)
2. Query the set (see `find`, `findMin`, `findOne`, `findOneMin`, `closestMatchMin`, and `closestMatch`)

```haskell
(0.7142857142857143,"Virgin Islands")
(0.5714285714285714,"Rhode Island")
(0.44,"Northern Marianas Islands")
(0.35714285714285715,"Maryland")
```
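The scores in the removed example output above look consistent with a normalized Levenshtein similarity, 1 - distance / length of the longer string, computed on lowercased input (for instance, "Burger Islands" and "Virgin Islands" differ by 4 edits over 14 characters, giving roughly 0.714). The following is a minimal sketch of that calculation using only `base`. It is an assumption about how the numbers arise, not the library's actual implementation, and the names `levenshtein` and `normalizedScore` are hypothetical.

```haskell
module Main where

import Data.Char (toLower)

-- | Levenshtein distance via the usual dynamic-programming table,
-- keeping only one row at a time. (Hypothetical helper, not library code.)
levenshtein :: String -> String -> Int
levenshtein s t = last (foldl nextRow [0 .. length t] s)
  where
    nextRow [] _ = []
    nextRow prev@(p : ps) c = scanl step (p + 1) (zip3 t prev ps)
      where
        -- left: cell to the left, up: cell above, diag: upper-left cell
        step left (tc, diag, up) =
          minimum [left + 1, up + 1, diag + fromEnum (tc /= c)]

-- | Similarity in [0, 1]: 1 means equal after lowercasing.
-- (Assumed scoring rule; it matches the sample scores shown above.)
normalizedScore :: String -> String -> Double
normalizedScore a b
  | longest == 0 = 1
  | otherwise = 1 - fromIntegral (levenshtein a' b') / fromIntegral longest
  where
    a' = map toLower a
    b' = map toLower b
    longest = max (length a') (length b')

main :: IO ()
main = do
  print (normalizedScore "Burger Islands" "Virgin Islands")
  print (normalizedScore "Connect a cat" "Connecticut")
  print (normalizedScore "Alaska" "Alaska")
```

Dividing by the longer length keeps the score between 0 and 1 regardless of how different the two strings are in size.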
Refer to the [Haddock docs](http://hackage.haskell.org/package/fuzzyset) for details.

Using the same definition of `statesSet` from previous example:
## Example

```haskell
>>> get statesSet "Why-oh-me-ing"
[(0.5384615384615384,"Wyoming")]

>>> get statesSet "Connect a cat"
[(0.6923076923076923,"Connecticut")]
{-# LANGUAGE OverloadedStrings #-}
module Main where

>>> get statesSet "Transylvania"
[(0.75,"Pennsylvania"),(0.3333333333333333,"California"),(0.3333333333333333,"Arkansas"),(0.3333333333333333,"Kansas")]
import Control.Monad.Trans.Class (lift)
import Data.Text (Text)
import Data.FuzzySet (FuzzySearchT, add_, closestMatch, runDefaultFuzzySearchT)

>>> get statesSet "CanOfSauce"
[(0.4,"Kansas")]
findMovie :: Text -> FuzzySearchT IO (Maybe Text)
findMovie = closestMatch

>>> get statesSet "Alaska"
[(1.0,"Alaska")]
prog :: FuzzySearchT IO ()
prog = do
  add_ "Jurassic Park"
  add_ "Terminator"
  add_ "The Matrix"
  result <- findMovie "The Percolator"
  lift (print result)

>>> get statesSet "Alaskanbraskansas"
[(0.47058823529411764,"Arkansas"),(0.35294117647058826,"Kansas"),(0.35294117647058826,"Alaska"),(0.35294117647058826,"Alabama"),(0.35294117647058826,"Nebraska")]
main :: IO ()
main = runDefaultFuzzySearchT prog
```
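To make the add-then-query shape of the new API easier to picture, here is a toy model of the same pattern built on `StateT` from `transformers`. Everything in it is hypothetical: `ToySearchT`, `toyAdd`, and `toyClosest` are illustrative stand-ins rather than the library's types or functions, and the similarity measure is a plain Dice coefficient over character bigrams, swapped in for brevity, not the library's matching algorithm.

```haskell
module Main where

import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.State.Strict (StateT, evalStateT, gets, modify')
import Data.Char (toLower)
import Data.List (intersect, maximumBy, nub)
import Data.Ord (comparing)

-- Hypothetical stand-in for FuzzySearchT: the "set" is a list held in StateT.
type ToySearchT m = StateT [String] m

-- Hypothetical analogue of add_: store an entry, skipping exact duplicates.
toyAdd :: Monad m => String -> ToySearchT m ()
toyAdd entry = modify' (\xs -> if entry `elem` xs then xs else entry : xs)

-- Distinct character bigrams of the lowercased input.
bigrams :: String -> [String]
bigrams s = nub (zipWith (\a b -> [a, b]) low (drop 1 low))
  where
    low = map toLower s

-- Dice coefficient on bigram sets: 2 * shared / total, always in [0, 1].
dice :: String -> String -> Double
dice a b
  | total == 0 = 0
  | otherwise = 2 * fromIntegral shared / fromIntegral total
  where
    ga = bigrams a
    gb = bigrams b
    shared = length (ga `intersect` gb)
    total = length ga + length gb

-- Hypothetical analogue of closestMatch: best-scoring entry, if any overlaps.
toyClosest :: Monad m => String -> ToySearchT m (Maybe String)
toyClosest query = gets pick
  where
    pick [] = Nothing
    pick xs =
      let best = maximumBy (comparing (dice query)) xs
       in if dice query best > 0 then Just best else Nothing

prog :: ToySearchT IO ()
prog = do
  toyAdd "Jurassic Park"
  toyAdd "Terminator"
  toyAdd "The Matrix"
  result <- toyClosest "The Percolator"
  lift (print result)

main :: IO ()
main = evalStateT prog []
```

Keeping the set in the monad's state is what lets `prog` read as a sequence of additions followed by a query, with `lift` reaching the underlying `IO` only for printing.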
16 changes: 16 additions & 0 deletions fourmolu.yaml
@@ -0,0 +1,16 @@
# Generated from https://fourmolu.github.io/config/
indentation: 2
column-limit: none
function-arrows: leading
comma-style: leading
import-export-style: leading
indent-wheres: true
record-brace-space: false
newlines-between-decls: 1
haddock-style: multi-line
haddock-style-module: null
let-style: auto
in-style: right-align
single-constraint-parens: always
unicode: never
respectful: false
59 changes: 36 additions & 23 deletions fuzzyset.cabal
@@ -1,41 +1,51 @@
cabal-version: 1.12
cabal-version: 2.2

-- This file has been generated from package.yaml by hpack version 0.35.2.
-- This file has been generated from package.yaml by hpack version 0.35.1.
--
-- see: https://github.com/sol/hpack

name: fuzzyset
version: 0.2.4
synopsis: Fuzzy set for approximate string matching
description: This library is based on the Python and JavaScript libraries with similar names.
version: 0.3.0
synopsis: Fuzzy set data structure for approximate string matching
description: Please see the README on GitHub at <https://github.com/laserpants/fuzzyset-haskell#readme>
category: Data
homepage: https://github.com/laserpants/fuzzyset-haskell
author: Johannes Hildén
homepage: https://github.com/laserpants/fuzzyset-haskell#readme
bug-reports: https://github.com/laserpants/fuzzyset-haskell/issues
author: Heikki Johannes Hildén
maintainer: hildenjohannes@gmail.com
copyright: 2017-2023 Johannes Hildén
license: BSD3
copyright: 2017-present laserpants
license: BSD-3-Clause
license-file: LICENSE
build-type: Simple
extra-source-files:
README.md

source-repository head
type: git
location: https://github.com/laserpants/fuzzyset-haskell

library
exposed-modules:
Data.FuzzySet
Data.FuzzySet.Internal
Data.FuzzySet.Types
Data.FuzzySet.Util
Data.FuzzySet.Monad
Data.FuzzySet.Simple
Data.FuzzySet.Utils
other-modules:
Paths_fuzzyset
autogen-modules:
Paths_fuzzyset
hs-source-dirs:
src
ghc-options: -Wall -Wcompat -Widentities -Wincomplete-record-updates -Wincomplete-uni-patterns -Wmissing-export-lists -Wmissing-home-modules -Wpartial-fields -Wredundant-constraints
build-depends:
base >=4.7 && <5
, data-default >=0.7.1.1 && <0.8
, text >=1.2.3.1 && <2.1
, text-metrics >=0.3.0 && <0.4
, unordered-containers >=0.2.10.0 && <0.3
, vector >=0.12.0.3 && <0.14
, mtl >=2.2.2 && <2.4.0
, text >=2.0.2 && <2.1.0
, text-metrics >=0.3.2 && <0.4.0
, transformers >=0.5.6.2 && <0.7.0.0
, unordered-containers >=0.2.19.1 && <0.3.0.0
, vector >=0.13.0.0 && <0.14.0.0
default-language: Haskell2010

test-suite fuzzyset-test
@@ -44,17 +54,20 @@ test-suite fuzzyset-test
other-modules:
Helpers
Paths_fuzzyset
autogen-modules:
Paths_fuzzyset
hs-source-dirs:
test
ghc-options: -threaded -rtsopts -with-rtsopts=-N
ghc-options: -Wall -Wcompat -Widentities -Wincomplete-record-updates -Wincomplete-uni-patterns -Wmissing-export-lists -Wmissing-home-modules -Wpartial-fields -Wredundant-constraints -threaded -rtsopts -with-rtsopts=-N
build-depends:
base >=4.7 && <5
, data-default >=0.7.1.1 && <0.8
, fuzzyset
, hspec >=2.7.1 && <2.11
, hspec >=2.10.10 && <2.12
, ieee754 >=0.8.0 && <0.9
, text >=1.2.3.1 && <2.1
, text-metrics >=0.3.0 && <0.4
, unordered-containers >=0.2.10.0 && <0.3
, vector >=0.12.0.3 && <0.14
, mtl >=2.2.2 && <2.4.0
, text >=2.0.2 && <2.1.0
, text-metrics >=0.3.2 && <0.4.0
, transformers >=0.5.6.2 && <0.7.0.0
, unordered-containers >=0.2.19.1 && <0.3.0.0
, vector >=0.13.0.0 && <0.14.0.0
default-language: Haskell2010
44 changes: 29 additions & 15 deletions package.yaml
@@ -1,24 +1,38 @@
name: fuzzyset
version: 0.2.4
synopsis: Fuzzy set for approximate string matching
description: This library is based on the Python and JavaScript libraries with similar names.
homepage: https://github.com/laserpants/fuzzyset-haskell
license: BSD3
version: 0.3.0
synopsis: Fuzzy set data structure for approximate string matching
github: laserpants/fuzzyset-haskell
license: BSD-3-Clause
license-file: LICENSE
author: Johannes Hildén
maintainer: hildenjohannes@gmail.com
copyright: 2017-2023 Johannes Hildén
author: "Heikki Johannes Hildén"
maintainer: "hildenjohannes@gmail.com"
copyright: 2017-present laserpants
category: Data

extra-source-files:
- README.md

description: Please see the README on GitHub at <https://github.com/laserpants/fuzzyset-haskell#readme>

dependencies:
- base >= 4.7 && < 5
- unordered-containers >= 0.2.10.0 && < 0.3
- vector >= 0.12.0.3 && < 0.14
- text >= 1.2.3.1 && < 2.1
- text-metrics >= 0.3.0 && < 0.4
- data-default >= 0.7.1.1 && < 0.8
- base >= 4.7 && < 5
- mtl >= 2.2.2 && < 2.4.0
- text >= 2.0.2 && < 2.1.0
- text-metrics >= 0.3.2 && < 0.4.0
- transformers >= 0.5.6.2 && < 0.7.0.0
- unordered-containers >= 0.2.19.1 && < 0.3.0.0
- vector >= 0.13.0.0 && < 0.14.0.0

ghc-options:
- -Wall
- -Wcompat
- -Widentities
- -Wincomplete-record-updates
- -Wincomplete-uni-patterns
- -Wmissing-export-lists
- -Wmissing-home-modules
- -Wpartial-fields
- -Wredundant-constraints

library:
source-dirs: src
@@ -33,5 +47,5 @@ tests:
- -with-rtsopts=-N
dependencies:
- fuzzyset
- hspec >= 2.7.1 && < 2.11
- hspec >= 2.10.10 && < 2.12
- ieee754 >= 0.8.0 && < 0.9