Implemented Various Distance Metric #107

Open
wants to merge 30 commits into
base: main
Changes from 25 commits
Commits
4900b40
Resolved Issue #64
Modernbeast02 Feb 7, 2023
341ebc0
Fixed All Operators Issue#64
Modernbeast02 Feb 7, 2023
34f6cc2
Merge branch 'PEC-CSS:main' into main
Modernbeast02 Feb 7, 2023
0c12221
Taken Pull and Merged
Modernbeast02 Feb 7, 2023
d6d077b
Merge branch 'main' of https://github.com/Modernbeast02/PWOC_slowmokit
Modernbeast02 Feb 7, 2023
16031e4
Resolved Issue#81
Modernbeast02 Feb 7, 2023
3fb7402
Revert "Resolved Issue#81"
Modernbeast02 Feb 7, 2023
019f9f2
Resolved Issue #81
Modernbeast02 Feb 7, 2023
e955eee
Merge branch 'main' into Issue
Ishwarendra Feb 8, 2023
9718f25
conflict resolved
Modernbeast02 Feb 8, 2023
c2d485e
conflict resolved
Modernbeast02 Feb 8, 2023
3512f9b
F1 Score #72
Modernbeast02 Feb 10, 2023
5d0550e
Resolved Conflict in matrix.cpp
Modernbeast02 Feb 10, 2023
93d1b7a
Resolved #72
Modernbeast02 Feb 10, 2023
0d73928
Formatted Code
Modernbeast02 Feb 10, 2023
c3fb7c9
f1Score.md
Modernbeast02 Feb 10, 2023
004c5c2
Update f1Score.md
Modernbeast02 Feb 11, 2023
8d19454
Update f1Score.md
Modernbeast02 Feb 11, 2023
2e4d110
Implemented Distance Metric
Modernbeast02 Feb 14, 2023
a55d7cb
Distance Metric
Modernbeast02 Feb 14, 2023
4e01ffb
test
Modernbeast02 Feb 14, 2023
6601058
test
Modernbeast02 Feb 14, 2023
cf7e2c6
Changes in DistanceMetric
Modernbeast02 Feb 14, 2023
313e84e
Changes Part 2, hopefully last
Modernbeast02 Feb 14, 2023
fbbeb70
Distance_metric.md
Modernbeast02 Feb 15, 2023
56ead03
Hope Never Dies
Modernbeast02 Feb 15, 2023
77e2b07
Merge branch 'main' of https://github.com/Modernbeast02/PWOC_slowmokit
Modernbeast02 Feb 15, 2023
deb0d05
Hope is Dead
Modernbeast02 Feb 15, 2023
87d1367
Resolved Conflicts
Modernbeast02 Feb 18, 2023
5c3a18e
Formatted
Modernbeast02 Feb 18, 2023
4 changes: 3 additions & 1 deletion CMakeLists.txt
@@ -62,4 +62,6 @@ add_library(slowmokit
src/slowmokit/methods/metrics/f1score.hpp
src/slowmokit/methods/metrics/f1score.cpp
src/slowmokit/methods/metrics/mean_squared_error.hpp
-src/slowmokit/methods/metrics/mean_squared_error.cpp)
+src/slowmokit/methods/metrics/mean_squared_error.cpp
+src/slowmokit/methods/metrics/distance_metric/distance_metric.cpp
+src/slowmokit/methods/metrics/distance_metric/distance_metric.hpp)
47 changes: 47 additions & 0 deletions docs/methods/metrics/distance_metric.md
@@ -0,0 +1,47 @@
# Distance Metric

## Euclidean Distance
Euclidean Distance represents the shortest distance between two points.

## Manhattan Distance
Manhattan Distance is the sum of absolute differences between points across all the dimensions.

## Manhattan Distance
Collaborator review comment:

Change this heading to "Minkowski Distance".

Minkowski Distance is the generalized form of Euclidean and Manhattan Distance.

## Cosine Similarity
Cosine similarity is a metric, helpful in determining, how similar the data objects are irrespective of their size.
Collaborator review comment:

Add a sentence on how to read the value for all four metrics, for example: the smaller the distance, the more similar the two points. For cosine similarity it is the reverse: the larger the value, the more similar the two vectors.
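For reference, the four quantities in this document can be written as follows for two vectors $x, y \in \mathbb{R}^n$. These are the standard textbook definitions, not text taken from the PR, and the symbol names are chosen here for illustration:

```latex
\begin{aligned}
d_{\text{euclidean}}(x, y) &= \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2} \\
d_{\text{manhattan}}(x, y) &= \sum_{i=1}^{n} \lvert x_i - y_i \rvert \\
d_{\text{minkowski}}(x, y; p) &= \Bigl( \sum_{i=1}^{n} \lvert x_i - y_i \rvert^{p} \Bigr)^{1/p}, \qquad p \ge 1 \\
\operatorname{cos\_sim}(x, y) &= \frac{\sum_{i=1}^{n} x_i \, y_i}{\lVert x \rVert_2 \, \lVert y \rVert_2}
\end{aligned}
```

Setting $p = 1$ in the Minkowski distance gives the Manhattan distance and $p = 2$ gives the Euclidean distance. The three distances are non-negative, while cosine similarity lies in $[-1, 1]$ and is undefined when either vector has zero magnitude.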


## Parameters

| Name | Definition | Type |
| -----------------| --------------------------------------------------------------------------------------- | ---- |
| x | A vector of values | `T`|
| y | A vector of values | `T`|


## Methods

| Name | Definition | Return value |
| ------------------------------- | ----------------------------------------------------- | ----------------- |
| `euclidean()` | To find the euclidean distance | `double` |
| `manhattan()` | To find the manhattan distance | `int, double` |
| `minkowski(int p)` | To find the minkowski distance | `double` |
| `magnitude(vector<T> &x)` | To find the magnitude of the vector | `double` |
| `cosineSimilarity()` | To find the cosine similarity | `double` |



## Example

```cpp
std::vector<double> dist1 = {1, 4, 4, 4};
std::vector<double> dist2 = {1, 2, 3, 4};
DistanceMetric Dist(dist1, dist2);
std::cout << "Minkowski Distance is " << Dist.minkowski(3) << std::endl;
std::cout << "Euclidean Distance is " << Dist.euclidean() << std::endl;
std::cout << "Manhattan Distance is " << Dist.manhattan() << std::endl;
std::cout << "Cosine Similarity is " << Dist.cosineSimilarity() << std::endl;
```
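As a cross-check of the example above, here is a minimal standalone sketch that computes the same four quantities with free functions. `minkowskiDistance` and `cosineSim` are illustrative names, not part of the PR's `DistanceMetric` class. The sketch assumes equal-length inputs and uses the absolute difference in the Minkowski sum.

```cpp
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Illustrative helper (not from the PR): Minkowski distance of order p >= 1.
// p = 1 gives the Manhattan distance and p = 2 the Euclidean distance.
double minkowskiDistance(const std::vector<double> &x, const std::vector<double> &y, int p)
{
  if (x.size() != y.size() || p < 1)
    throw std::domain_error("vectors must have equal size and p must be >= 1");
  double sum = 0;
  for (std::size_t i = 0; i < x.size(); i++)
    sum += std::pow(std::abs(x[i] - y[i]), p);
  return std::pow(sum, 1.0 / p);
}

// Illustrative helper (not from the PR): dot(x, y) / (|x| * |y|).
double cosineSim(const std::vector<double> &x, const std::vector<double> &y)
{
  if (x.size() != y.size())
    throw std::domain_error("vectors must have equal size");
  double dot = 0, normX = 0, normY = 0;
  for (std::size_t i = 0; i < x.size(); i++)
  {
    dot += x[i] * y[i];
    normX += x[i] * x[i];
    normY += y[i] * y[i];
  }
  return dot / (std::sqrt(normX) * std::sqrt(normY));
}
```

For `dist1 = {1, 4, 4, 4}` and `dist2 = {1, 2, 3, 4}` the element-wise differences are 0, 2, 1, 0. The Manhattan distance is therefore 3, the Euclidean distance is √5 ≈ 2.236, the Minkowski distance with p = 3 is ∛9 ≈ 2.080, and the cosine similarity is 37 / (7 · √30) ≈ 0.965.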


1 change: 0 additions & 1 deletion docs/methods/metrics/f1Score.md
@@ -7,7 +7,6 @@ Some advantages of F1-score:
2)If you choose your positive class as the one with fewer samples, F1-score can help balance the metric across positive/negative samples.



## Parameters

| Name | Definition | Type |
14 changes: 14 additions & 0 deletions examples/metrics/distance_metric_eg.cpp
@@ -0,0 +1,14 @@
// #include "../src/slowmokit/methods/metrics/distance_metric/distance_metric.hpp"


// int main()
// {
// std::vector<double> dist1 = {1, 4, 4, 4};
// std::vector<double> dist2 = {1, 2, 3, 4};
// DistanceMetric Dist(dist1, dist2);
// std::cout << "Minkowski Distance is " << Dist.minkowski(3) << std::endl;
// std::cout << "Euclidean Distance is " << Dist.euclidean() << std::endl;
// std::cout << "Manhattan Distance is " << Dist.manhattan() << std::endl;
// std::cout << "Cosine Similarity is " << Dist.cosineSimilarity() << std::endl;
// return 0;
// }
72 changes: 72 additions & 0 deletions src/slowmokit/methods/metrics/distance_metric/distance_metric.cpp
@@ -0,0 +1,72 @@
/**
* @file methods/metrics/distance_metric/distance_metric.hpp
*
* Easy include to calculate distance metrics
*/

#include "distance_metric.hpp"

template<class T>
DistanceMetric<T>::DistanceMetric(std::vector<T> &x, std::vector<T> &y)
Collaborator review comment:

Suggested change
-DistanceMetric<T>::DistanceMetric(std::vector<T> &x, std::vector<T> &y)
+DistanceMetric<T>::DistanceMetric(const std::vector<T> &x, const std::vector<T> &y)

We don't want users to worry about their data being modified. Use const. (No need to update the .hpp file.)

const in cpp but not in hpp?

.hpp file

void f(int);

.cpp file

void f(const int param1) {
   return;
}

is valid and recommended.
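A note on the exchange above: the `void f(int)` / `void f(const int)` pairing works because top-level `const` on a by-value parameter is not part of the function's signature. For a reference parameter, `const` qualifies the referenced type, so the declaration in the `.hpp` and the definition in the `.cpp` would need to agree. A minimal sketch with the hypothetical functions `twice` and `describe` (neither is from the PR):

```cpp
#include <vector>

// By value: top-level const is not part of the signature, so this
// declaration and the definition below name the same function.
int twice(int value);

int twice(const int value) { return 2 * value; }

// By reference: const belongs to the referenced type, so these two
// parameter types differ and the functions are distinct overloads.
int describe(std::vector<int> &) { return 1; }       // picked for non-const lvalues
int describe(const std::vector<int> &) { return 2; } // picked for const lvalues and temporaries
```

For an out-of-class member definition such as the constructor in this diff, declaring `std::vector<T> &` in the header and defining `const std::vector<T> &` in the source file is a compile error rather than a new overload, so both files would need the `const`.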

{
this->x = x;
this->y = y;
Collaborator review comment:

Suggested change
-DistanceMetric<T>::DistanceMetric(std::vector<T> &x, std::vector<T> &y)
-{
-this->x = x;
-this->y = y;
+DistanceMetric<T>::DistanceMetric(std::vector<T> &x, std::vector<T> &y) : x(x), y(y)
+{

Use an initialiser list. It is faster because the members are constructed directly instead of being default-constructed and then assigned.
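A minimal sketch of the initialiser-list form suggested above, using a hypothetical class `DistanceMetricSketch` that keeps only the constructor plus a `size()` accessor added for illustration. Taking the vectors by const reference is an assumption that combines both review suggestions:

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for the PR's class, reduced to the constructor.
template<class T> class DistanceMetricSketch
{
 private:
  std::vector<T> x;
  std::vector<T> y;

 public:
  // The members are copy-constructed in the initialiser list; inside the
  // body, the names x and y refer to the constructor parameters.
  DistanceMetricSketch(const std::vector<T> &x, const std::vector<T> &y) : x(x), y(y)
  {
    if (x.size() != y.size())
    {
      throw std::domain_error("Size of the two vectors must be same");
    }
  }

  // Accessor added only so the sketch can be exercised.
  std::size_t size() const { return x.size(); }
};
```

The size check stays in the constructor body, so a mismatch still throws `std::domain_error` exactly as in the diff.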

if (x.size() != y.size())
{
throw std::domain_error("Size of the two vectors must be same");
}
}

template<class T> double DistanceMetric<T>::euclidean()
{
double distance = 0;
Collaborator review comment:

Suggested change
-double distance = 0;
+long double distance = 0;

You will have to downcast long double to double at the end; use static_cast.

UPDATE: Do this wherever I have added double -> long double comments.
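A minimal sketch of the pattern requested above: accumulate in `long double`, then convert once on return. `euclideanDistance` is a hypothetical free function (the PR uses a member function), and it assumes both vectors have the same size, which the constructor in the diff already enforces:

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical free function, for illustration only.
// Precondition (not checked here): x.size() == y.size().
template<class T> double euclideanDistance(const std::vector<T> &x, const std::vector<T> &y)
{
  long double sum = 0; // wider accumulator for the sum of squares
  for (std::size_t i = 0; i < x.size(); i++)
  {
    const long double diff = static_cast<long double>(x[i]) - static_cast<long double>(y[i]);
    sum += diff * diff;
  }
  // Narrow back to double once, at the very end.
  return static_cast<double>(std::sqrt(sum));
}
```

On platforms where `long double` has the same representation as `double`, this changes nothing numerically but remains valid code.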

int n = x.size();
for (int i = 0; i < n; i++)
{
distance += (x[i] - y[i]) * (x[i] - y[i]);
}
return std::sqrt(distance);
}

template<class T> T DistanceMetric<T>::manhattan()
{
T distance = 0;
int n = x.size();
for (int i = 0; i < n; i++)
{
distance += std::abs(x[i] - y[i]);
}
return distance;
}
template<class T> double DistanceMetric<T>::minkowski(int power)
{
double distance = 0;
int n = x.size();
for (int i = 0; i < n; i++)
{
distance += std::pow(x[i] - y[i], power);
Collaborator review comment:

Use std::abs(x[i] - y[i]) as the base instead.
Otherwise odd values of p can give wrong results whenever a difference is negative.
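To make the odd-power problem concrete, the sketch below contrasts the loop from the diff with the `std::abs` variant. Both function names are hypothetical, and the range check on `power` is an addition for illustration rather than something the PR implements:

```cpp
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Same arithmetic as the diff: the signed difference is raised to the power.
double minkowskiSigned(const std::vector<double> &x, const std::vector<double> &y, int power)
{
  double distance = 0;
  for (std::size_t i = 0; i < x.size(); i++)
    distance += std::pow(x[i] - y[i], power);
  return std::pow(distance, 1.0 / power);
}

// With the absolute difference as the base, every term is non-negative.
double minkowskiAbs(const std::vector<double> &x, const std::vector<double> &y, int power)
{
  if (power < 1)
    throw std::domain_error("power must be at least 1");
  double distance = 0;
  for (std::size_t i = 0; i < x.size(); i++)
    distance += std::pow(std::abs(x[i] - y[i]), power);
  return std::pow(distance, 1.0 / power);
}
```

With `x = {0, 0}`, `y = {1, -1}` and `power = 3`, the signed terms are -1 and +1, so `minkowskiSigned` returns 0 for two different points, while `minkowskiAbs` returns ∛2 ≈ 1.26. With `x = {0}` and `y = {2}`, the signed sum is -8, and `std::pow` of a negative base with the non-integer exponent 1/3 yields NaN. The documentation example does not expose this because all of its differences are non-negative.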

}
return std::pow(distance, 1.0 / power);
}

template<class T> double DistanceMetric<T>::magnitude(std::vector<T> &x)
{
double result = 0;
Collaborator review comment:

Suggested change
-double result = 0;
+long double result = 0;

double might overflow

int n = x.size();
for (int i = 0; i < n; i++)
{
result += x[i] * x[i];
}
return std::sqrt(result);
}

template<class T> double DistanceMetric<T>::cosineSimilarity()
{
double dotProduct = 0;
Collaborator review comment:

Suggested change
-double dotProduct = 0;
+long double dotProduct = 0;

int n = x.size();
for (int i = 0; i < n; i++)
{
dotProduct += x[i] * y[i];
}
return dotProduct / (magnitude(x) * magnitude(y));
}
58 changes: 58 additions & 0 deletions src/slowmokit/methods/metrics/distance_metric/distance_metric.hpp
@@ -0,0 +1,58 @@
/**
* @file methods/metrics/distance_metric/distance_metric.hpp
*
* Easy include to calculate distances
*/

#ifndef SLOWMOKIT_DISTANCE_METRIC_HPP
#define SLOWMOKIT_DISTANCE_METRIC_HPP
#include "../../../core.hpp"

/**
* Takes predicted and actual values of classes
* @param x
* @param y
Collaborator review comment on lines +13 to +14:

Add a short statement describing x and y.

* @returns the distance metrics
* @throws domain_error exception when size of the two vectors is not equal
*/
template<class T> class DistanceMetric
{
private:
std::vector<T> x;
std::vector<T> y;

public:
DistanceMetric(std::vector<T> &x, std::vector<T> &y);

/**
* @returns euclidean distance between the two vectors
*/
double euclidean();


/**
* @returns manhattan distance between the two vectors
*/
T manhattan();


/**
* @param power The order of the norm
* @returns minkowski distance between the two vectors
Collaborator review comment:

Suggested change
-* @returns minkowski distance between the two vectors
+* @returns minkowski distance between the two vectors
+* @throws ...

power < 1 is not allowed.

*/
double minkowski(int);

/**
* @brief to find the magnitude of the vector
* @param x a vector
* @returns magnitude of x
*/
double magnitude(std::vector<T> &);
Ishwarendra marked this conversation as resolved.

/**
* @returns cosine similarity between the two vectors
*/
double cosineSimilarity();
};

#endif // SLOWMOKIT_DISTANCE_METRIC_HPP