Merge branch 'development'

SiLab-Bonn · Sep 24, 2019 · 47f11e4
2 parents d2c8c30 + a66b7d3

Showing 11 changed files with 1,314 additions and 1,477 deletions.
diff --git a/.travis.yml b/.travis.yml
@@ -1,27 +1,29 @@
 language: python
 python:
-- 2.7
-- 3.5
+  - 2.7
+  - 3.7
 
 sudo: false
 
 notifications:
   email:
   - pohl@physik.uni-bonn.de
+  - janssen@physik.uni-bonn.de
 
 install:
   - if [[ "$TRAVIS_PYTHON_VERSION" == "2.7" ]]; then
-      wget https://repo.continuum.io/miniconda/Miniconda-latest-Linux-x86_64.sh -O miniconda.sh;
+      wget https://repo.continuum.io/miniconda/Miniconda2-latest-Linux-x86_64.sh -O miniconda.sh;
     else
       wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh -O miniconda.sh;
     fi
-  - chmod +x miniconda.sh
-  - bash miniconda.sh -b -p $HOME/miniconda
+  - bash miniconda.sh -b -p "$HOME/miniconda"
   - export PATH="$HOME/miniconda/bin:$PATH"
-  - conda install --yes numpy numba future nose docutils
+  - conda update --yes conda
   - conda info -a
+  - conda install --yes numpy numba nose
   - pip install coverage coveralls
-  - python setup.py develop
+  - pip install -e .
+  - conda list
 
 script:
   - nosetests  # Run nosetests with jitted functions

diff --git a/README.md b/README.md
@@ -1,57 +1,78 @@
-# Pixel Clusterizer [![Build Status](https://travis-ci.org/SiLab-Bonn/pixel_clusterizer.svg?branch=master)](https://travis-ci.org/SiLab-Bonn/pixel_clusterizer) [![Build Status](https://ci.appveyor.com/api/projects/status/github/SiLab-Bonn/pixel_clusterizer)](https://ci.appveyor.com/project/SiLab-Bonn/pixel_clusterizer) [![Coverage Status](https://coveralls.io/repos/github/SiLab-Bonn/pixel_clusterizer/badge.svg?branch=master)](https://coveralls.io/github/SiLab-Bonn/pixel_clusterizer?branch=master)
+# Pixel Clusterizer [![Build Status](https://travis-ci.org/SiLab-Bonn/pixel_clusterizer.svg?branch=master)](https://travis-ci.org/SiLab-Bonn/pixel_clusterizer) [![Build status](https://ci.appveyor.com/api/projects/status/c8jqu9ow696opevf?svg=true)](https://ci.appveyor.com/project/laborleben/pixel-clusterizer) [![Coverage Status](https://coveralls.io/repos/github/SiLab-Bonn/pixel_clusterizer/badge.svg?branch=master)](https://coveralls.io/github/SiLab-Bonn/pixel_clusterizer?branch=master)
 
-Pixel_clusterizer is an easy to use pixel hit-clusterizer for Python. It clusters hits on an event basis in space and time.
-
-The hits have to be defined as a numpy recarray. The array has to have the following fields:
-- event_number
-- frame
-- column
-- row
-- charge
+## Intended Use
 
-or a mapping of the names has to be provided. The data type does not matter.
+Pixel_clusterizer is an easy-to-use pixel hit clusterizer for Python. It clusters hits that share an event number in space and time.
 
-The result of the clustering is the hit array extended by the following fields:
-- cluster_ID
-- is_seed
-- cluster_size
-- n_cluster
+The hits must be provided in a numpy recarray. The array must contain the following columns ("fields"):
+- ```event_number```
+- ```frame```
+- ```column```
+- ```row```
+- ```charge```
 
-A new array with cluster information is also created created and has the following fields:
-- event_number
-- ID
-- size
-- charge
-- seed_column
-- seed_row
-- mean_column
-- mean_row
+If the column names are different, a mapping of the names to the default names can be provided. The data type of each column is not fixed. The ```column```/```row``` values can be either indices (integer, default) or positions (float). The ```charge``` values can be either integer or float (default).
 
+After clustering, two new arrays are returned:
+1. The cluster hits array is the hits array extended by the following columns:
+    - ```cluster_ID```
+    - ```is_seed```
+    - ```cluster_size```
+    - ```n_cluster```
+2. The cluster array contains in each row the information about a single cluster. It has the following columns:
+    - ```event_number```
+    - ```ID```
+    - ```n_hits```
+    - ```charge```
+    - ```seed_column```
+    - ```seed_row```
+    - ```mean_column```
+    - ```mean_row```
 
+## Installation
 
-# Installation
+Python 2.7 or Python 3 is required. There are many ways to install Python, though we recommend using [Anaconda Python](https://www.anaconda.com/distribution/) or [Miniconda](https://docs.conda.io/en/latest/miniconda.html).
 
-The stable code is hosted on PyPI and can be installed by typing:
+### Prerequisites
+
+The following packages are required:
+```
+numpy numba>=0.24.0
+```
+
+### Installation of pixel_clusterizer
 
+The stable code is hosted on PyPI and can be installed by typing:
+```
 pip install pixel_clusterizer
+```
 
-# Usage
+For developers: clone the pixel_clusterizer git repository and install it in editable mode with the following command:
+```
+pip install -e .
+```
+
+To test the basic functionality of pixel_clusterizer, run the following command:
+```
+nosetests pixel_clusterizer
+```
+
+## Usage
 
 ```
 import numpy as np
 
 from pixel_clusterizer import clusterizer
 
-hits = np.ones(shape=(3, ), dtype=clusterizer.hit_data_type)  # Create some data with std. hit data type
+hits = np.ones(shape=(3, ), dtype=clusterizer.default_hits_dtype)  # Create some data with std. hit data type
 
 cr = clusterizer.HitClusterizer()  # Initialize clusterizer
 
-hits_clustered, cluster = cr.cluster_hits(hits)  # Cluster hits
+cluster_hits, clusters = cr.cluster_hits(hits)  # Cluster hits
 
 ```
-Also take a look at the example folder!
+Also please have a look at the ```examples``` folder!
 
-# Test installation
-```
-nosetests pixel_clusterizer
-```
+## Support
+
+Please use GitHub's [issue tracker](https://github.com/SiLab-Bonn/pixel_clusterizer/issues) for bug reports/feature requests/questions.
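
The field list in the new README can be made concrete with a plain NumPy structured array. This is a minimal sketch: the dtypes and the `raw` field names (`evt`, `x`, `y`, `tot`) are assumptions chosen for illustration, and the renaming uses NumPy's `rename_fields` helper rather than the package's own name mapping.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

# Assumed dtypes; the README only names the fields and says the types may vary.
hits_dtype = np.dtype([
    ("event_number", np.int64),
    ("frame", np.uint8),
    ("column", np.uint16),
    ("row", np.uint16),
    ("charge", np.float32),
])

# Three hits: two in event 0, one in event 1.
hits = np.zeros(3, dtype=hits_dtype)
hits["event_number"] = [0, 0, 1]
hits["column"] = [10, 11, 40]
hits["row"] = [20, 20, 5]
hits["charge"] = [1.5, 0.5, 2.0]

# Hypothetical readout format with different field names.
raw = np.zeros(2, dtype=[("evt", np.int64), ("frame", np.uint8),
                         ("x", np.uint16), ("y", np.uint16),
                         ("tot", np.float32)])

# Map the hypothetical names onto the default names listed in the README.
mapping = {"evt": "event_number", "x": "column", "y": "row", "tot": "charge"}
renamed = rfn.rename_fields(raw, mapping)
```

After the renaming, `renamed` has the same field names as `hits`, so both arrays follow the layout the README describes.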
diff --git a/appveyor.yml b/appveyor.yml
@@ -8,23 +8,21 @@ environment:
     - PYTHON_VERSION: 2.7
       MINICONDA: C:\Miniconda-x64
       PYTHON_ARCH: "64"
-    - PYTHON_VERSION: 3.5
-      MINICONDA: C:\Miniconda35
+    - PYTHON_VERSION: 3.7
+      MINICONDA: C:\Miniconda37
       PYTHON_ARCH: "32"
-    - PYTHON_VERSION: 3.5
-      MINICONDA: C:\Miniconda35-x64
+    - PYTHON_VERSION: 3.7
+      MINICONDA: C:\Miniconda37-x64
       PYTHON_ARCH: "64"
 
-init:
-  - "ECHO %PYTHON_VERSION% %MINICONDA%"
-
 install:
   # Miniconda Python setup + external packages installation
-  - set PATH=%MINICONDA%;%MINICONDA%\\Scripts;%PATH%  # Miniconda is already installed on appveyor: https://github.com/appveyor/ci/issues/359
-  - conda install --yes numpy numba future nose docutils
+  - set PATH=%MINICONDA%;%MINICONDA%\Scripts;%MINICONDA%\Library\bin;%PATH%
+  - conda update --yes conda
   - conda info -a
+  - conda install --yes numpy numba nose
+  - pip install -e .
   - conda list
-  - python setup.py develop  # Install pixel_clusterizer
 
 test_script:
   - nosetests
diff --git a/pixel_clusterizer/__init__.py b/pixel_clusterizer/__init__.py
@@ -1,3 +1,7 @@
 # http://stackoverflow.com/questions/17583443/what-is-the-correct-way-to-share-package-version-with-setup-py-and-the-package
 from pkg_resources import get_distribution
+from pixel_clusterizer.clusterizer import HitClusterizer, default_hits_descr, default_hits_dtype, default_cluster_hits_descr, default_cluster_hits_dtype, default_clusters_descr, default_clusters_dtype
+
+
 __version__ = get_distribution('pixel_clusterizer').version
+__all__ = ["HitClusterizer", "default_hits_descr", "default_hits_dtype", "default_cluster_hits_descr", "default_cluster_hits_dtype", "default_clusters_descr", "default_clusters_dtype"]
diff --git a/pixel_clusterizer/cluster_functions.py b/pixel_clusterizer/cluster_functions.py
@@ -1,5 +1,5 @@
 ''' Fast clustering functions that are compiled just in time via numba '''
-
+import numpy as np
 from numba import njit
 
 
@@ -13,8 +13,8 @@ def _new_event(event_number_1, event_number_2):
 def _pixel_masked(hit, array):
     ''' Checks whether a hit (column/row) is masked or not. Array is a 2D array with boolean elements corresponding to pixels, indicating whether a pixel is disabled or not.
     '''
-    if array.shape[0] > hit["column"] and array.shape[1] > hit["row"]:
-        return array[hit["column"], hit["row"]]
+    if hit["column"] >= 0 and hit["row"] >= 0 and array.shape[0] > int(hit["column"]) and array.shape[1] > int(hit["row"]):
+        return array[int(hit["column"]), int(hit["row"])]
     else:
         return False
 
@@ -30,25 +30,27 @@ def _pixel_masked(hit, array):
 
 
 @njit()
-def _finish_cluster(hits, clusters, cluster_size, cluster_hit_indices, cluster_index, cluster_id, charge_correction, noisy_pixels, disabled_pixels):
+def _finish_cluster(hits, clusters, cluster_size, cluster_hit_indices, cluster_index, cluster_id, charge_correction, charge_weighted_clustering, noisy_pixels, disabled_pixels):
     ''' Set hit and cluster information of the cluster (e.g. number of hits in the cluster (cluster_size), total cluster charge (charge), ...).
     '''
     cluster_charge = 0
-    max_cluster_charge = -1
-    # necessary for charge weighted hit position
-    total_weighted_column = 0
-    total_weighted_row = 0
-
-    for i in range(cluster_size):
-        hit_index = cluster_hit_indices[i]
-        if hits[hit_index]['charge'] > max_cluster_charge:
+    seed_charge = -1
+    total_column = 0
+    total_row = 0
+
+    for hit_index in cluster_hit_indices:
+        if hits[hit_index]['charge'] > seed_charge:
             seed_hit_index = hit_index
-            max_cluster_charge = hits[hit_index]['charge']
+            seed_charge = hits[hit_index]['charge']
         hits[hit_index]['is_seed'] = 0
         hits[hit_index]['cluster_size'] = cluster_size
-        # include charge correction in sum
-        total_weighted_column += hits[hit_index]['column'] * (hits[hit_index]['charge'] + charge_correction)
-        total_weighted_row += hits[hit_index]['row'] * (hits[hit_index]['charge'] + charge_correction)
+        if charge_weighted_clustering:
+            # include charge correction in sum
+            total_column += hits[hit_index]['column'] * (hits[hit_index]['charge'] + charge_correction)
+            total_row += hits[hit_index]['row'] * (hits[hit_index]['charge'] + charge_correction)
+        else:
+            total_column += hits[hit_index]['column']
+            total_row += hits[hit_index]['row']
         cluster_charge += hits[hit_index]['charge']
         hits[hit_index]['cluster_ID'] = cluster_id
 
@@ -59,9 +61,13 @@ def _finish_cluster(hits, clusters, cluster_size, cluster_hit_indices, cluster_i
     clusters[cluster_index]["charge"] = cluster_charge
     clusters[cluster_index]['seed_column'] = hits[seed_hit_index]['column']
     clusters[cluster_index]['seed_row'] = hits[seed_hit_index]['row']
-    # correct total charge value and calculate mean column and row
-    clusters[cluster_index]['mean_column'] = float(total_weighted_column) / (cluster_charge + cluster_size * charge_correction)
-    clusters[cluster_index]['mean_row'] = float(total_weighted_row) / (cluster_charge + cluster_size * charge_correction)
+    if charge_weighted_clustering:
+        # correct total charge value and calculate mean column and row
+        clusters[cluster_index]['mean_column'] = float(total_column) / (cluster_charge + cluster_size * charge_correction)
+        clusters[cluster_index]['mean_row'] = float(total_row) / (cluster_charge + cluster_size * charge_correction)
+    else:
+        clusters[cluster_index]['mean_column'] = float(total_column) / cluster_size
+        clusters[cluster_index]['mean_row'] = float(total_row) / cluster_size
 
     # Call end of cluster function hook
     _end_of_cluster_function(
@@ -102,11 +108,11 @@ def _hit_ok(hit, min_hit_charge, max_hit_charge):
     ''' Check if given hit is within the limits.
     '''
     # Omit hits with charge < min_hit_charge
-    if hit['charge'] < min_hit_charge:
+    if min_hit_charge is not None and hit['charge'] < min_hit_charge:
         return False
 
     # Omit hits with charge > max_hit_charge
-    if max_hit_charge != 0 and hit['charge'] > max_hit_charge:
+    if max_hit_charge is not None and hit['charge'] > max_hit_charge:
         return False
 
     return True
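
The hunk above replaces the old convention, where `max_hit_charge == 0` meant "no upper limit", with explicit `None` checks for both limits. A minimal pure-Python sketch of the new behaviour follows; the function name `hit_ok` and its scalar `charge` argument are simplifications for illustration, not the jitted function from the package.

```python
def hit_ok(charge, min_hit_charge=None, max_hit_charge=None):
    """Return True if charge lies within the optional limits.

    A limit of None disables the corresponding check (illustrative sketch).
    """
    # Omit hits with charge < min_hit_charge
    if min_hit_charge is not None and charge < min_hit_charge:
        return False
    # Omit hits with charge > max_hit_charge
    if max_hit_charge is not None and charge > max_hit_charge:
        return False
    return True


no_limits = hit_ok(5)                        # both checks disabled
zero_is_a_limit = hit_ok(5, None, 0)         # 0 is now a real upper limit
zero_charge_allowed = hit_ok(0, None, 0)     # charge equal to the limit passes
below_minimum = hit_ok(1, min_hit_charge=2)  # rejected by the lower limit
```

With this convention a limit of zero can be expressed directly, which the previous `!= 0` test could not distinguish from "no limit".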
@@ -139,8 +145,8 @@ def _is_in_max_difference(value_1, value_2, max_difference):
     Circumvents numba bug #1653
     '''
     if value_1 <= value_2:
-        return value_2 - value_1 <= max_difference
-    return value_1 - value_2 <= max_difference
+        return (np.nextafter(value_2, value_1) - np.nextafter(value_1, value_2)) <= max_difference
+    return (np.nextafter(value_1, value_2) - np.nextafter(value_2, value_1)) <= max_difference
 
 
 # @njit()
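
The `np.nextafter` calls above move each value one representable floating-point step toward the other before subtracting. The commit does not state the motivation; one plausible reading is that it keeps float positions that differ by exactly the allowed distance from failing the test because of rounding. The sketch below compares both variants in plain Python (no numba); the function names are hypothetical.

```python
import numpy as np


def in_max_difference_plain(value_1, value_2, max_difference):
    # Behaviour before this commit: direct comparison of the difference
    if value_1 <= value_2:
        return value_2 - value_1 <= max_difference
    return value_1 - value_2 <= max_difference


def in_max_difference_nextafter(value_1, value_2, max_difference):
    # Behaviour after this commit: nudge both values one floating-point
    # step toward each other before taking the difference
    if value_1 <= value_2:
        return (np.nextafter(value_2, value_1) - np.nextafter(value_1, value_2)) <= max_difference
    return (np.nextafter(value_1, value_2) - np.nextafter(value_2, value_1)) <= max_difference


# 0.1 + 0.2 is stored as 0.30000000000000004, so its distance to 0.1
# comes out slightly larger than 0.2 in double precision.
a = 0.1
b = 0.1 + 0.2
plain_result = in_max_difference_plain(a, b, 0.2)
nextafter_result = in_max_difference_nextafter(a, b, 0.2)
```

For integer column and row indices the two variants agree on whole-number distances, since a one-step nudge is far smaller than 1.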
@@ -158,27 +164,19 @@ def _is_in_max_difference(value_1, value_2, max_difference):
 
 
 @njit()
-def _cluster_hits(hits, clusters, assigned_hit_array, cluster_hit_indices, column_cluster_distance, row_cluster_distance, frame_cluster_distance, min_hit_charge, max_hit_charge, ignore_same_hits, noisy_pixels, disabled_pixels):
+def _cluster_hits(hits, clusters, assigned_hit_array, cluster_hit_indices, min_hit_charge, max_hit_charge, charge_correction, charge_weighted_clustering, column_cluster_distance, row_cluster_distance, frame_cluster_distance, ignore_same_hits, noisy_pixels, disabled_pixels):
     ''' Main precompiled function that loops over the hits and clusters them
     '''
     total_hits = hits.shape[0]
     if total_hits == 0:
         return 0  # total clusters
-    max_cluster_hits = cluster_hit_indices.shape[0]
 
     if total_hits != clusters.shape[0]:
         raise ValueError("hits and clusters must be the same size")
 
     if total_hits != assigned_hit_array.shape[0]:
         raise ValueError("hits and assigned_hit_array must be the same size")
 
-    # Correction for charge weighting
-    # Some chips have non-zero charge for a charge value of zero, charge needs to be corrected to calculate cluster center correctly
-    if min_hit_charge == 0:
-        charge_correction = 1
-    else:
-        charge_correction = 0
-
     # Temporary variables that are reset for each cluster or event
     start_event_hit_index = 0
     start_event_cluster_index = 0
@@ -222,7 +220,7 @@ def _cluster_hits(hits, clusters, assigned_hit_array, cluster_hit_indices, colum
         assigned_hit_array[i] = 1
         cluster_size = 1  # actual cluster has one hit so far
 
-        for j in cluster_hit_indices:  # Loop over all hits of the actual cluster; cluster_hit_indices is updated within the loop if new hit are found
+        for j in cluster_hit_indices:  # Loop over all hits of the actual cluster; cluster_hit_indices is updated within the loop if new hits are found
             if j < 0:  # There are no more cluster hits found
                 break
 
@@ -247,8 +245,6 @@ def _cluster_hits(hits, clusters, assigned_hit_array, cluster_hit_indices, colum
                 if _is_in_max_difference(hits[j]['column'], hits[k]['column'], column_cluster_distance) and _is_in_max_difference(hits[j]['row'], hits[k]['row'], row_cluster_distance) and _is_in_max_difference(hits[j]['frame'], hits[k]['frame'], frame_cluster_distance):
                     if not ignore_same_hits or hits[j]['column'] != hits[k]['column'] or hits[j]['row'] != hits[k]['row']:
                         cluster_size += 1
-                        if cluster_size > max_cluster_hits:
-                            raise IndexError('cluster_hit_indices is too small to contain all cluster hits')
                         cluster_hit_indices[cluster_size - 1] = k
                         assigned_hit_array[k] = 1
 
@@ -264,10 +260,11 @@ def _cluster_hits(hits, clusters, assigned_hit_array, cluster_hit_indices, colum
                 hits=hits,
                 clusters=clusters,
                 cluster_size=cluster_size,
-                cluster_hit_indices=cluster_hit_indices,
+                cluster_hit_indices=cluster_hit_indices[:cluster_size],
                 cluster_index=start_event_cluster_index + event_cluster_index,
                 cluster_id=event_cluster_index,
                 charge_correction=charge_correction,
+                charge_weighted_clustering=charge_weighted_clustering,
                 noisy_pixels=noisy_pixels,
                 disabled_pixels=disabled_pixels)
             event_cluster_index += 1
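
The mean position computed in `_finish_cluster` can be summarised with a short NumPy sketch. It mirrors the two branches in the diff: a charge-weighted mean where each weight is `charge + charge_correction`, and a plain arithmetic mean otherwise. The helper name `cluster_mean` and its array arguments are assumptions for illustration; the package itself works on structured hit arrays inside a jitted loop.

```python
import numpy as np


def cluster_mean(position, charge, charge_correction=0, charge_weighted=True):
    """Mean position of one cluster along one axis (illustrative sketch)."""
    position = np.asarray(position, dtype=float)
    charge = np.asarray(charge, dtype=float)
    if charge_weighted:
        # The sum of the weights equals
        # cluster_charge + cluster_size * charge_correction
        weights = charge + charge_correction
        return float(np.sum(position * weights) / np.sum(weights))
    # Unweighted: total position divided by the cluster size
    return float(np.mean(position))


columns = [10, 11]
charges = [3, 1]
weighted = cluster_mean(columns, charges)                            # (30 + 11) / 4
unweighted = cluster_mean(columns, charges, charge_weighted=False)   # (10 + 11) / 2
corrected = cluster_mean(columns, [0, 0], charge_correction=1)       # weights become [1, 1]
```

The last call shows why a charge correction is useful: with all charges equal to zero the uncorrected weights would sum to zero, while a correction of 1 yields a well-defined mean.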