codecFlags, no codec aspects in object mapper, javadoc changes and test case changes, version change to 1.5
m-manu committed Jan 21, 2017
1 parent 57aaf24 commit 5f2b2a4
Showing 20 changed files with 410 additions and 260 deletions.
69 changes: 46 additions & 23 deletions README.md
@@ -7,37 +7,50 @@ This compact utility library is an annotation based *object mapper* for HBase (w
* for use in Hadoop MapReduce jobs that read from and/or write to HBase tables
* and write efficient unit-tests for `Mapper` and `Reducer` classes
* define *data access objects* for entities that map to HBase rows
* for single/range/bulk access of rows of an HBase table

## Usage
Let's say you have an HBase table `citizens` with row-key format of `country_code#UID`. Now, let's say your table is created with three column families `main`, `optional` and `tracked`, which may have columns like `uid`, `name`, `salary` etc.

This library enables you to represent your HBase table as a bean-like class, as below:

```java
@HBTable("citizens")
public class Citizen implements HBRecord<String> {

@HBRowKey
private String countryCode;

@HBRowKey
private Integer uid;

@HBColumn(family = "main", column = "name")
private String name;

@HBColumn(family = "optional", column = "age")
private Short age;

@HBColumn(family = "optional", column = "salary")
private Integer sal;
@HBColumn(family = "optional", column = "flags")
private Map<String, Integer> extraFlags;

@HBColumn(family = "optional", column = "custom_details")
private Map<String, Integer> customDetails;

@HBColumn(family = "optional", column = "dependents")
private Dependents dependents;
@HBColumnMultiVersion(family = "optional", column = "phone_number")
private NavigableMap<Long, Integer> phoneNumber; // Multi-versioned column. This annotation enables you to fetch multiple versions of column values


@HBColumnMultiVersion(family = "tracked", column = "phone_number")
private NavigableMap<Long, Integer> phoneNumber;

@HBColumn(family = "optional", column = "pincode", codecFlags = {@Flag(name = "serializeAsString", value = "true")})
private Integer pincode;

@Override
public String composeRowKey() {
return String.format("%s#%d", countryCode, uid);
}

@Override
public void parseRowKey(String rowKey) {
String[] pieces = rowKey.split("#");
this.countryCode = pieces[0];
@@ -47,22 +60,32 @@ public class Citizen implements HBRecord<String> {
// Constructors, getters and setters
}
```
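The row-key methods in the class above are plain string manipulation, so they can be exercised without any HBase dependency. Below is a minimal sketch of that round trip; the class name and standalone helper signatures are illustrative, not part of the library:

```java
// Illustrative sketch of the composeRowKey/parseRowKey logic from the
// Citizen class above, extracted as standalone helpers (names are made up).
public class RowKeySketch {

    // Mirrors composeRowKey(): "country_code#UID", e.g. "IND#101"
    public static String composeRowKey(String countryCode, int uid) {
        return String.format("%s#%d", countryCode, uid);
    }

    // Mirrors parseRowKey(): splits "IND#101" back into its two pieces
    public static String[] parseRowKey(String rowKey) {
        return rowKey.split("#");
    }

    public static void main(String[] args) {
        String rowKey = composeRowKey("IND", 101);
        String[] pieces = parseRowKey(rowKey);
        System.out.println(rowKey + " -> country=" + pieces[0] + ", uid=" + pieces[1]);
    }
}
```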
That is,

* The above class `Citizen` represents the HBase table `citizens`, using the `@HBTable` annotation.
* Logic for converting the HBase row key to member variables of `Citizen` objects and vice-versa is implemented using the `parseRowKey` and `composeRowKey` methods respectively.
* The data type representing the row key is the type parameter to the `HBRecord` generic interface (`String` in the above case). Fields that form the row key are annotated with `@HBRowKey`.
* Names of columns and their column families are specified using the `@HBColumn` or `@HBColumnMultiVersion` annotations.
* The class may contain fields of simple data types (e.g. `String`, `Integer`), generic data types (e.g. `Map`, `List`) or even your custom classes.
* The `@HBColumnMultiVersion` annotation allows you to map multiple versions of a column in a `NavigableMap<Long, ?>`. In the above example, the field `phoneNumber` is mapped to the column `phone_number` within the column family `tracked` (which is configured for multiple versions).
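To make the multi-version mapping concrete, here is a plain-Java sketch (no HBase involved; the timestamps and phone-number values are invented) of what such a `NavigableMap<Long, Integer>` field holds after a multi-version read, and how the latest version is picked:

```java
import java.util.NavigableMap;
import java.util.TreeMap;

// Sketch: a @HBColumnMultiVersion field holds timestamp -> value entries,
// sorted ascending by timestamp; the latest version is lastEntry().
public class VersionedFieldSketch {

    public static NavigableMap<Long, Integer> samplePhoneNumberHistory() {
        NavigableMap<Long, Integer> phoneNumber = new TreeMap<>();
        phoneNumber.put(1484900000000L, 911111111); // oldest version
        phoneNumber.put(1484950000000L, 922222222);
        phoneNumber.put(1485000000000L, 933333333); // latest version
        return phoneNumber;
    }

    public static void main(String[] args) {
        NavigableMap<Long, Integer> history = samplePhoneNumberHistory();
        System.out.println("latest = " + history.lastEntry().getValue());
        System.out.println("oldest = " + history.firstEntry().getValue());
    }
}
```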

See source files [Citizen.java](./src/test/java/com/flipkart/hbaseobjectmapper/entities/Citizen.java) and [Employee.java](./src/test/java/com/flipkart/hbaseobjectmapper/entities/Employee.java) for detailed examples.

Now, this library enables you to represent rows of the `citizens` HBase table as instances of the `Citizen` class. For the above definition of your `Citizen` class,

* you can use methods in `HBObjectMapper` class to convert `Citizen` objects to HBase's `Put` and `Result` objects and vice-versa
* you can inherit from class `AbstractHBDAO` that contains methods like `get` (for random single/bulk/range access of rows), `persist` (for writing rows) and `delete` (for deleting rows)

### Serialization / Deserialization

* The default codec of this library has the following behavior:
* uses HBase's native methods to serialize objects of data types `Boolean`, `Short`, `Integer`, `Long`, `Float`, `Double`, `String` and `BigDecimal`
* uses [Jackson's JSON serializer](http://wiki.fasterxml.com/JacksonHome) for all other data types
* serializes `null` as `null`
* To control/modify serialization/deserialization behavior, you may define your own codec (by implementing the `Codec` interface) or extend the default codec (by extending the `BestSuitCodec` class).
* The optional parameter `codecFlags` (supported by both `@HBColumn` and `@HBColumnMultiVersion` annotations) can be used to pass custom flags to the underlying codec.
* The default codec takes a flag `serializeAsString` (as in the above example), which when set to `true` serializes even numerical fields as a `String`. Your custom codec may define other such flags.
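The effect of a `serializeAsString`-style flag can be sketched with plain JDK calls. Note this is only an approximation of the idea: the real default codec uses HBase's `Bytes` utility, so the exact byte layout may differ.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Sketch of the two serialization modes behind a 'serializeAsString' flag:
// native 4-byte big-endian encoding vs. the number's string representation.
public class CodecFlagSketch {

    public static byte[] serialize(int value, boolean serializeAsString) {
        if (serializeAsString) {
            // e.g. 560034 -> the 6 bytes of the text "560034"
            return String.valueOf(value).getBytes(StandardCharsets.UTF_8);
        }
        // e.g. 560034 -> 4 bytes (akin to HBase's Bytes.toBytes(int))
        return ByteBuffer.allocate(4).putInt(value).array();
    }

    public static int deserialize(byte[] bytes, boolean serializeAsString) {
        if (serializeAsString) {
            return Integer.parseInt(new String(bytes, StandardCharsets.UTF_8));
        }
        return ByteBuffer.wrap(bytes).getInt();
    }

    public static void main(String[] args) {
        System.out.println(serialize(560034, true).length + " bytes as string");
        System.out.println(serialize(560034, false).length + " bytes as native int");
    }
}
```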

## MapReduce use-cases

### Mapper
@@ -187,8 +210,8 @@ Citizen[] ape = citizenDao.get(new String[] {"IND#1", "IND#2"}); //bulk get
// In below, note that "IND#1" is inclusive and "IND#5" is exclusive
List<Citizen> lpe = citizenDao.get("IND#1", "IND#5"); //range get

// for row keys in range ["IND#1", "IND#5"), fetch 3 versions of field 'phoneNumberHistory' as a NavigableMap<row key, NavigableMap<timestamp, column value>>:
NavigableMap<String, NavigableMap<Long, Object>> phoneNumberHistory
= citizenDao.fetchFieldValues("IND#1", "IND#5", "phoneNumberHistory", 3);
//(bulk variant of above range method is also available)

@@ -204,7 +227,7 @@ citizenDao.delete("IND#2"); // Delete a row by its row key
citizenDao.getHBaseTable() // returns HTable instance (in case you want to directly play around)

```
(see [TestsAbstractHBDAO.java](./src/test/java/com/flipkart/hbaseobjectmapper/TestsAbstractHBDAO.java) for more detailed examples)
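The nested map returned by `fetchFieldValues` can be post-processed with ordinary `NavigableMap` operations. A sketch with hand-built data (no live table needed; the row key and values below are invented) that reduces row key → (timestamp → value) down to the latest value per row:

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Sketch: reduce fetchFieldValues-style output (row key -> timestamp -> value)
// to the latest value per row, using lastEntry() on each inner map.
public class FieldHistorySketch {

    public static NavigableMap<String, Object> latestPerRow(
            NavigableMap<String, NavigableMap<Long, Object>> history) {
        NavigableMap<String, Object> latest = new TreeMap<>();
        for (Map.Entry<String, NavigableMap<Long, Object>> e : history.entrySet()) {
            latest.put(e.getKey(), e.getValue().lastEntry().getValue());
        }
        return latest;
    }

    // Hand-built stand-in for what a real fetchFieldValues call might return
    public static NavigableMap<String, NavigableMap<Long, Object>> sample() {
        NavigableMap<String, NavigableMap<Long, Object>> history = new TreeMap<>();
        NavigableMap<Long, Object> versions = new TreeMap<>();
        versions.put(100L, "older value");
        versions.put(200L, "newer value");
        history.put("IND#1", versions);
        return history;
    }

    public static void main(String[] args) {
        System.out.println(latestPerRow(sample()));
    }
}
```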

**Please note:** Since we're dealing with HBase (and not an OLTP data store), fitting a classical ORM paradigm may not make sense. So this library doesn't intend to evolve as a full-fledged ORM. However, if you do intend to use HBase via ORM, I suggest you use [Apache Phoenix](https://phoenix.apache.org/).

@@ -223,21 +246,21 @@ Add the following entry within the `dependencies` section of your `pom.xml`:
<dependency>
<groupId>com.flipkart</groupId>
<artifactId>hbase-object-mapper</artifactId>
<version>1.5</version>
</dependency>
```
See artifact details for [com.flipkart:hbase-object-mapper](http://search.maven.org/#search%7Cgav%7C1%7Cg%3A%22com.flipkart%22%20AND%20a%3A%22hbase-object-mapper%22) on **Maven Central**

## How to build?
To build this project, follow the steps below:

* Do a `git clone` of this repository
* Check out the latest stable version: `git checkout v1.5`
* Execute `mvn clean install` from shell

Currently, projects that use this library are running on [Hortonworks Data Platform v2.2](http://hortonworks.com/blog/announcing-hdp-2-2/) (corresponds to Hadoop 2.6 and HBase 0.98). However, if you're using a different distribution (like [Cloudera](http://www.cloudera.com/)) or a different version of Hadoop, you may change the versions in [pom.xml](./pom.xml) to the desired ones and build the project.

**Please note**: Test cases are very comprehensive - they even spin up an [in-memory HBase test cluster](https://github.com/apache/hbase/blob/master/hbase-server/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java) to run data access related test cases (a near-real-world scenario). So, build times can sometimes be longer, depending on your machine configuration.

## Releases

@@ -249,6 +272,6 @@ If you intend to request a feature or report a bug, you may use [Github Issues f

## License

Copyright 2017 Flipkart Internet Pvt Ltd.

Licensed under the [Apache License, version 2.0](http://www.apache.org/licenses/LICENSE-2.0) (the "License"). You may not use this product or its source code except in compliance with the License.
16 changes: 6 additions & 10 deletions pom.xml
@@ -3,15 +3,16 @@
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<name>HBase Object Mapper</name>
<description>
Java-annotation based compact utility library for HBase that helps you:
[1] convert objects of your bean-like classes to HBase rows and vice-versa (for use in writing MapReduce jobs for HBase tables and writing high-quality unit test cases)
[2] define 'Data Access Object' classes for random access of HBase rows
</description>
<modelVersion>4.0.0</modelVersion>
<groupId>com.flipkart</groupId>
<artifactId>hbase-object-mapper</artifactId>
<version>1.5</version>
<url>https://flipkart-incubator.github.io/hbase-object-mapper/</url>
<scm>
<url>https://github.com/flipkart-incubator/hbase-object-mapper</url>
</scm>
@@ -59,11 +60,6 @@
<artifactId>hbase-client</artifactId>
<version>${version.hbase}</version>
</dependency>
<dependency>
<groupId>com.google.guava</groupId>
<artifactId>guava</artifactId>
<version>14.0.1</version>
</dependency>
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-databind</artifactId>
@@ -19,11 +19,11 @@
/**
* A <i>Data Access Object</i> class that enables simple random access (read/write) of HBase rows.
* <p>
Please note: Unlike the {@link HBObjectMapper} class, this class is <b>not</b> thread-safe
* </p>
*
@param <R> Data type of row key (must be {@link Comparable} with itself and must be {@link Serializable})
@param <T> Entity type that maps to an HBase row (this type must implement the {@link HBRecord} interface)
*/
public abstract class AbstractHBDAO<R extends Serializable & Comparable<R>, T extends HBRecord<R>> {

@@ -316,7 +316,7 @@ private static void populateFieldValuesToMap(Field field, Result result, Map<Str
final String rowKey = Bytes.toString(CellUtil.cloneRow(cell));
if (!map.containsKey(rowKey))
map.put(rowKey, new TreeMap<Long, Object>());
map.get(rowKey).put(cell.getTimestamp(), hbObjectMapper.byteArrayToValue(CellUtil.cloneValue(cell), fieldType, hbColumn.codecFlags()));
}
}

Expand Down
12 changes: 12 additions & 0 deletions src/main/java/com/flipkart/hbaseobjectmapper/Flag.java
@@ -0,0 +1,12 @@
package com.flipkart.hbaseobjectmapper;

/**
* A flag for {@link com.flipkart.hbaseobjectmapper.codec.Codec Codec} (specify parameter name and value)
* <p>
* This is to be used exclusively for input to {@link HBColumn#codecFlags() codecFlags} parameter of {@link HBColumn} and {@link HBColumnMultiVersion} annotations
*/
public @interface Flag {
String name();

String value();
}
9 changes: 7 additions & 2 deletions src/main/java/com/flipkart/hbaseobjectmapper/HBColumn.java
Original file line number Diff line number Diff line change
@@ -1,9 +1,12 @@
package com.flipkart.hbaseobjectmapper;

import java.io.Serializable;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Type;
import java.util.Map;

/**
* Maps an entity field to an HBase column
@@ -23,7 +26,9 @@
String column();

/**
* <b>[optional]</b> flags to be passed to codec's {@link com.flipkart.hbaseobjectmapper.codec.Codec#serialize(Serializable, Map) serialize} and {@link com.flipkart.hbaseobjectmapper.codec.Codec#deserialize(byte[], Type, Map) deserialize} methods
* <p>
* Note: These flags will be passed as a <code>Map&lt;String, String&gt;</code> (param name and param value)
*/
Flag[] codecFlags() default {};
}
@@ -5,9 +5,15 @@
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Type;
import java.util.Map;

/**
* Maps an entity field of type <code>NavigableMap&lt;Long, T&gt;</code> to an HBase column whose data type is represented by <code>T</code>.
* <p>
* As the name explains, this annotation is the multi-version variant of {@link HBColumn}.
* <p>
* <b>Please note</b>: <code>T</code> must be {@link Serializable}
*/
@Target(ElementType.FIELD)
@Retention(RetentionPolicy.RUNTIME)
@@ -24,7 +30,10 @@
String column();

/**
* <b>[optional]</b> flags to be passed to codec's {@link com.flipkart.hbaseobjectmapper.codec.Codec#serialize(Serializable, Map) serialize} and {@link com.flipkart.hbaseobjectmapper.codec.Codec#deserialize(byte[], Type, Map) deserialize} methods
* <p>
* Note: These flags will be passed as a <code>Map&lt;String, String&gt;</code> (param name and param value)
*/
Flag[] codecFlags() default {};

}