Proof of concept: Suggest similar person to rename them #262

Draft
wants to merge 16 commits into
base: master
15 changes: 13 additions & 2 deletions lib/BackgroundJob/Tasks/CreateClustersTask.php
@@ -196,12 +196,14 @@ private function createClusterIfNeeded(string $userId) {
$faces = $this->faceMapper->getFaces($userId, $modelId);
$this->logInfo(count($faces) . ' faces found for clustering');

$relations = $this->relationMapper->findByUserAsMatrix($userId, $modelId);

// Cluster is associative array where key is person ID.
// Value is array of face IDs. For old clusters, person IDs are some existing person IDs,
// and for new clusters is whatever chinese whispers decides to identify them.
//
$currentClusters = $this->getCurrentClusters($faces);
$newClusters = $this->getNewClusters($faces);
$newClusters = $this->getNewClusters($faces, $relations);
$this->logInfo(count($newClusters) . ' persons found after clustering');

// New merge
@@ -237,7 +239,7 @@ private function getCurrentClusters(array $faces): array {
return $chineseClusters;
}

private function getNewClusters(array $faces): array {
private function getNewClusters(array $faces, array $relations): array {
// Create edges for chinese whispers
$sensitivity = $this->settingsService->getSensitivity();
$min_confidence = $this->settingsService->getMinimumConfidence();
@@ -252,6 +254,15 @@ private function getNewClusters(array $faces): array {
}
for ($j = $i, $face_count2 = count($faces); $j < $face_count2; $j++) {
$face2 = $faces[$j];
if ($this->relationMapper->existsOnMatrix($face1->id, $face2->id, $relations)) {
Collaborator

OOooh, I liiiike this idea!:)

Collaborator

On the other hand, now that I think about it... this will affect cluster creation. So we might end up with those faces in the same bucket. Which is OK, but the next execution will "split" those faces into two clusters again, and the user will have to accept the merges again.

Collaborator

So, consider keeping those two systems ("automatic cluster creation" and "manual approvals") decoupled, as coupling them can lead to unintended feedback loops.

Owner, Author

> OOooh, I liiiike this idea!:)

I guess now you understand why I insisted on implementing the relations tables with faces, and not with persons. 😅

> On the other hand, now that I think about it... this will affect cluster creation. So we might end up with those faces in the same bucket. Which is OK, but the next execution will "split" those faces into two clusters again, and the user will have to accept the merges again.

In what circumstances do you suppose they would be split again?
At this point, we only add or skip edges, and chinese_whispers processes them just as it always has. The same relations are applied on every execution, so the result should be as stable as it has been until now.

Joining multiple clusters can change the main face of the resulting cluster, but since the relations are between faces, they remain valid and are reused in every execution, whatever the resulting clusters are.

In any case, when the main faces change, new face relations will be created. In the example in my last comment, 44 clusters were merged, and just 4 new relations were created.
I was concerned that this table could grow a lot, but the growth seems small, and in any case we can remove the entries that were not accepted before adding the new ones.

> So, consider keeping those two systems ("automatic cluster creation" and "manual approvals") decoupled, as coupling them can lead to unintended feedback loops.

I don't quite understand what you mean, but maybe the previous comments will clarify it.

The only worry is that we need something like an undo, in case the user accepts a wrong face by mistake.

Collaborator

I see. So, let's see how this plays out!:)

$state = $this->relationMapper->getStateOnMatrix($face1->id, $face2->id, $relations);
if ($state === Relation::ACCEPTED) {
$edges[] = array($i, $j);
continue;
} else if ($state === Relation::REJECTED) {
continue;
}
}
$distance = dlib_vector_length($face1->descriptor, $face2->descriptor);

if ($distance < $sensitivity) {
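To make the edge-building discussion above concrete, here is a minimal, self-contained sketch of how accepted and rejected relations could override the distance test. It is not the PR's code: the state values, `stateOnMatrix()`, `distance()` (a plain Euclidean stand-in for `dlib_vector_length()`), and `buildEdges()` are hypothetical names, faces are plain arrays rather than entities, and self-pairs are skipped for brevity.

```php
<?php
declare(strict_types=1);

// Hypothetical state values; the real constants are defined in Relation.php
// and their values are not shown in this diff.
const PROPOSED = 0;
const ACCEPTED = 1;
const REJECTED = 2;

// Look up a pair in either orientation; unknown pairs count as PROPOSED.
function stateOnMatrix(int $face1, int $face2, array $matrix): int {
    return $matrix[$face1][$face2] ?? $matrix[$face2][$face1] ?? PROPOSED;
}

// Plain Euclidean distance, standing in for dlib_vector_length().
function distance(array $a, array $b): float {
    $sum = 0.0;
    foreach ($a as $k => $v) {
        $sum += ($v - $b[$k]) ** 2;
    }
    return sqrt($sum);
}

// Returns edges as pairs of indexes into $faces.
function buildEdges(array $faces, array $relations, float $sensitivity): array {
    $edges = [];
    $count = count($faces);
    for ($i = 0; $i < $count; $i++) {
        for ($j = $i + 1; $j < $count; $j++) {
            $state = stateOnMatrix($faces[$i]['id'], $faces[$j]['id'], $relations);
            if ($state === ACCEPTED) {
                $edges[] = [$i, $j]; // user said "same person": always link
                continue;
            }
            if ($state === REJECTED) {
                continue;            // user said "different": never link
            }
            if (distance($faces[$i]['descriptor'], $faces[$j]['descriptor']) < $sensitivity) {
                $edges[] = [$i, $j]; // no decision yet: fall back to distance
            }
        }
    }
    return $edges;
}

$faces = [
    ['id' => 10, 'descriptor' => [0.0, 0.0]],
    ['id' => 11, 'descriptor' => [0.1, 0.0]],
    ['id' => 12, 'descriptor' => [5.0, 5.0]],
];
$relations = [10 => [12 => ACCEPTED], 11 => [10 => REJECTED]];

// Only the accepted pair (indexes 0 and 2) becomes an edge; the close but
// rejected pair (indexes 0 and 1) does not.
$edges = buildEdges($faces, $relations, 0.5);
```

Because the relations are keyed by face ID rather than person ID, calling `buildEdges()` again over the same faces and relations yields the same edges, which is the stability argument made in the thread.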
4 changes: 2 additions & 2 deletions lib/Controller/RelationController.php
@@ -89,15 +89,15 @@ public function findByPerson(int $personId) {
$proposed = array();
$relations = $this->relationMapper->findFromPerson($this->userId, $personId, RELATION::PROPOSED);
foreach ($relations as $relation) {
$person1 = $this->personMapper->findFromFace($this->userId, $relation->getFace1());
$person1 = $this->personMapper->findFromFace($this->userId, $relation->face1);
if (($person1->getId() !== $personId) && ($mainPerson->getName() !== $person1->getName())) {
$proffer = array();
$proffer['origId'] = $mainPerson->getId();
$proffer['id'] = $person1->getId();
$proffer['name'] = $person1->getName();
$proposed[] = $proffer;
}
$person2 = $this->personMapper->findFromFace($this->userId, $relation->getFace2());
$person2 = $this->personMapper->findFromFace($this->userId, $relation->face2);
if (($person2->getId() !== $personId) && ($mainPerson->getName() !== $person2->getName())) {
$proffer = array();
$proffer['origId'] = $mainPerson->getId();
6 changes: 3 additions & 3 deletions lib/Db/Relation.php
@@ -48,21 +48,21 @@ class Relation extends Entity {
*
* @var int
* */
protected $face1;
public $face1;

/**
* Face id of a face of a person related with $face1
*
* @var int
* */
protected $face2;
public $face2;

/**
* State of two face relation. These are proposed, and can be accepted
* as the same person, or rejected.
Collaborator

Maybe add as a comment: "Rejected relations are never proposed again."

*
* @var int
* */
protected $state;
public $state;

}
33 changes: 21 additions & 12 deletions lib/Db/RelationMapper.php
@@ -121,24 +121,33 @@ public function findByUserAsMatrix(string $userId, int $modelId): array {
$matrix = array();
$relations = $this->findByUser($userId, $modelId);
foreach ($relations as $relation) {
$face1 = $relation->getFace1();
$face2 = $relation->getFace2();
$state = $relation->getState();

$row = array();
if (isset($matrix[$face1])) {
$row = $matrix[$face1];
if (isset($matrix[$relation->face1])) {
$row = $matrix[$relation->face1];
}
$row[$face2] = $state;
$matrix[$face1] = $row;
$row[$relation->face2] = $relation->state;
$matrix[$relation->face1] = $row;
}
return $matrix;
}

public function existsOnMatrix(Relation $relation, array $matrix): bool {
$face1 = $relation->getFace1();
$face2 = $relation->getFace2();
public function getStateOnMatrix(int $face1, int $face2, array $matrix): int {
if (isset($matrix[$face1])) {
$row = $matrix[$face1];
if (isset($row[$face2])) {
return $matrix[$face1][$face2];
}
}
if (isset($matrix[$face2])) {
$row = $matrix[$face2];
if (isset($row[$face1])) {
return $matrix[$face2][$face1];
}
}
return Relation::PROPOSED;
}

public function existsOnMatrix(int $face1, int $face2, array $matrix): bool {
if (isset($matrix[$face1])) {
$row = $matrix[$face1];
if (isset($row[$face2])) {
@@ -160,7 +169,7 @@ public function merge(string $userId, int $modelId, array $relations): int {
try {
$oldMatrix = $this->findByUserAsMatrix($userId, $modelId);
foreach ($relations as $relation) {
if ($this->existsOnMatrix($relation, $oldMatrix))
if ($this->existsOnMatrix($relation->face1, $relation->face2, $oldMatrix))
continue;
$this->insert($relation);
$added++;
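The matrix and merge logic in this file can be sketched without the database layer. The following is a simplified illustration under stated assumptions, not the mapper itself: `RelationRow` is a hypothetical stand-in for the `Relation` entity (only the three public fields from the diff), the state values are made up, and `mergeProposals()` works on in-memory arrays instead of calling `insert()`. Unlike the `merge()` shown in the diff, the sketch also records each newly added pair in the matrix so that duplicates within one batch are skipped; that is an assumption about the desired behavior.

```php
<?php
declare(strict_types=1);

// Hypothetical state values; the real ones are defined in Relation.php.
const PROPOSED = 0;
const ACCEPTED = 1;
const REJECTED = 2;

// Hypothetical stand-in for the Relation entity.
final class RelationRow {
    public int $face1;
    public int $face2;
    public int $state;

    public function __construct(int $face1, int $face2, int $state) {
        $this->face1 = $face1;
        $this->face2 = $face2;
        $this->state = $state;
    }
}

// Build a nested array: $matrix[face1][face2] = state.
function asMatrix(array $relations): array {
    $matrix = [];
    foreach ($relations as $relation) {
        $matrix[$relation->face1][$relation->face2] = $relation->state;
    }
    return $matrix;
}

// A pair exists if it is stored in either orientation.
function existsOnMatrix(int $face1, int $face2, array $matrix): bool {
    return isset($matrix[$face1][$face2]) || isset($matrix[$face2][$face1]);
}

// Append only the proposals whose pair is not already known, so a pair that
// was rejected (or accepted) earlier is never proposed again.
function mergeProposals(array $stored, array $proposed): array {
    $matrix = asMatrix($stored);
    $added = 0;
    foreach ($proposed as $relation) {
        if (existsOnMatrix($relation->face1, $relation->face2, $matrix)) {
            continue;
        }
        $stored[] = $relation;
        // Assumption: remember the new pair so the same batch cannot add it twice.
        $matrix[$relation->face1][$relation->face2] = $relation->state;
        $added++;
    }
    return [$stored, $added];
}

$stored = [new RelationRow(1, 2, REJECTED)];
$proposed = [
    new RelationRow(2, 1, PROPOSED), // same pair as the rejected one, reversed
    new RelationRow(3, 4, PROPOSED), // genuinely new
    new RelationRow(4, 3, PROPOSED), // duplicate of the previous proposal
];
[$all, $added] = mergeProposals($stored, $proposed);
```

In this example only the (3, 4) proposal is added: the reversed (2, 1) pair matches the stored rejection, and (4, 3) duplicates a pair added earlier in the same batch.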