HBase tutorial for beginners   December 22nd, 2008

This post is old. Most of the code does not compile at the moment!

First of all, HBase is a column-oriented database. However, you have to forget everything you have learned about tables, columns and rows in the RDBMS world. The data in an HBase instance is laid out more like a hashtable, and the data is immutable. Whenever you update the data, you are actually just creating a new version of it.
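
If the hashtable picture feels abstract, here is a rough sketch of it in plain Java. It is only a mental model and not how HBase is actually implemented: the class VersionedTableSketch and its methods are names I made up for illustration. Each cell is looked up by row key and column name, and every write is stored under its own timestamp instead of replacing the previous value.

```java
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical in-memory model for illustration; this is not how HBase is implemented.
final class VersionedTableSketch {

    // row key -> column name -> timestamp -> value
    private final NavigableMap<String, NavigableMap<String, NavigableMap<Long, String>>> rows =
            new TreeMap<>();

    // An "update" never overwrites anything: it stores a new version under its timestamp.
    void put(String row, String column, long timestamp, String value) {
        rows.computeIfAbsent(row, r -> new TreeMap<>())
            .computeIfAbsent(column, c -> new TreeMap<>())
            .put(timestamp, value);
    }

    // Returns the newest version of a cell, or null if the cell has never been written.
    String get(String row, String column) {
        NavigableMap<Long, String> versions = versions(row, column);
        return versions.isEmpty() ? null : versions.lastEntry().getValue();
    }

    // Returns how many versions of a cell are stored.
    int versionCount(String row, String column) {
        return versions(row, column).size();
    }

    private NavigableMap<Long, String> versions(String row, String column) {
        NavigableMap<String, NavigableMap<Long, String>> columns = rows.get(row);
        if (columns == null) {
            return new TreeMap<Long, String>();
        }
        NavigableMap<Long, String> versions = columns.get(column);
        return versions == null ? new TreeMap<Long, String>() : versions;
    }

    public static void main(String[] args) {
        VersionedTableSketch table = new VersionedTableSketch();
        table.put("post1", "post:title", 1L, "Hello World");
        table.put("post1", "post:title", 2L, "Hello again");
        System.out.println(table.get("post1", "post:title"));          // newest version
        System.out.println(table.versionCount("post1", "post:title")); // both versions are kept
    }
}
```

Writing the same cell twice leaves two versions behind, and a plain read returns the newest one.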

This tutorial will be very hands-on, with not too much explanation. There are a number of articles that describe column-oriented databases in detail. Check out my delicious tag for some good ones, for instance jimbojw.com’s excellent introduction.

I used Apple OS X 10.5.6 for this tutorial. I am not sure whether it will work on Windows or Linux.

The goal of this tutorial is to create a model for a blog and to access it from a Java program.

Get started

  • Download the latest stable release from Apache. I went with the hbase-0.18.1 release.
  • Unpack it, for instance to ~/hbase
  • Edit ~/hbase/conf/hbase-env.sh and set the correct JAVA_HOME variable.
  • Start hbase by running ~/hbase/bin/start-hbase.sh

Create a table

  • Start the hbase shell by running ~/hbase/bin/hbase shell
  • Run create 'blogposts', 'post', 'image' in the shell

Now you have a table called blogposts, with a post family and an image family. These families are “static”, like the columns in the RDBMS world.
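
To make the “static” part a bit more concrete, here is a small sketch in plain Java. It assumes the family:qualifier column naming used by the shell commands in this post. The class FamilyCheckSketch and its methods are made up for illustration and are not part of the HBase API. The idea is that the families are fixed when the table is created, while the part after the colon is never declared anywhere up front.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for illustration; not part of the HBase API.
final class FamilyCheckSketch {

    // The families given when the table was created, e.g. "post" and "image".
    private final Set<String> families;

    FamilyCheckSketch(String... families) {
        this.families = new TreeSet<String>(Arrays.asList(families));
    }

    // Returns the family part of a "family:qualifier" column name.
    static String familyOf(String column) {
        int colon = column.indexOf(':');
        return colon < 0 ? column : column.substring(0, colon);
    }

    // A column is acceptable if its family was declared; the qualifier can be anything.
    boolean accepts(String column) {
        return families.contains(familyOf(column));
    }

    public static void main(String[] args) {
        FamilyCheckSketch blogposts = new FamilyCheckSketch("post", "image");
        System.out.println(blogposts.accepts("post:title"));    // declared family
        System.out.println(blogposts.accepts("comment:first")); // undeclared family
    }
}
```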

Add some data to the table

Run the following commands in the shell:

  • put 'blogposts', 'post1', 'post:title', 'Hello World'
  • put 'blogposts', 'post1', 'post:author', 'The Author'
  • put 'blogposts', 'post1', 'post:body', 'This is a blog post'
  • put 'blogposts', 'post1', 'image:header', 'image1.jpg'
  • put 'blogposts', 'post1', 'image:bodyimage', 'image2.jpg'

Look at the data

Run get 'blogposts', 'post1' in the shell. This should output something like this:

COLUMN            CELL
image:bodyimage   timestamp=1229953133260, value=image2.jpg
image:header      timestamp=1229953110419, value=image1.jpg
post:author       timestamp=1229953071910, value=The Author
post:body         timestamp=1229953072029, value=This is a blog post
post:title        timestamp=1229953071791, value=Hello World
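
The timestamp on each cell is the version mentioned in the introduction. When a put does not specify a timestamp, HBase uses the current time in milliseconds since the Unix epoch, so the numbers above can be turned into readable dates. A minimal sketch in plain Java (CellTimestampSketch is just a name I picked, and java.time needs Java 8 or newer):

```java
import java.time.Instant;

// Hypothetical helper for illustration; not part of the HBase API.
final class CellTimestampSketch {

    // Interprets a cell timestamp as milliseconds since the Unix epoch (UTC).
    static Instant toInstant(long cellTimestamp) {
        return Instant.ofEpochMilli(cellTimestamp);
    }

    public static void main(String[] args) {
        // Timestamp of image:bodyimage in the output above.
        System.out.println(toInstant(1229953133260L)); // prints 2008-12-22T13:38:53.260Z
    }
}
```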

Summary part 1

So, what have we accomplished so far? We have created a table and added one ‘record’ to it. This record consists of the blog post itself and the images attached to it. So, how do we retrieve that data from a Java application?

Integrate with HBase from Java

In order to integrate with HBase you will need the following jar files in your classpath:

  • commons-logging-1.0.4.jar
  • hadoop-0.18.1-core.jar
  • hbase-0.18.1.jar
  • log4j-1.2.13.jar

All of these are found in ~/hbase/lib and ~/hbase.

OK. Here’s the Java code:

import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.io.RowResult;

import java.util.HashMap;
import java.util.Map;
import java.io.IOException;

public class HBaseConnector {

    // Reads one row from the blogposts table and returns it as column name -> value.
    public static Map<String, String> retrievePost(String postId) throws IOException {
        HTable table = new HTable(new HBaseConfiguration(), "blogposts");
        Map<String, String> post = new HashMap<String, String>();

        RowResult result = table.getRow(postId);

        // Column names (family:qualifier) and cell values both come back as byte arrays.
        for (byte[] column : result.keySet()) {
            post.put(new String(column), new String(result.get(column).getValue()));
        }
        return post;
    }

    public static void main(String[] args) throws IOException {
        Map<String, String> blogpost = HBaseConnector.retrievePost("post1");
        System.out.println(blogpost.get("post:title"));
        System.out.println(blogpost.get("post:author"));
    }
}

This code should print out ‘Hello World’ and ‘The Author’.
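
One detail worth a second look is the new String(...) calls, which decode the bytes with the platform's default charset. Below is a small sketch of the same conversion with an explicit charset, in plain Java so it runs without HBase. ColumnDecodingSketch and decode are hypothetical names, and using UTF-8 is my assumption about how the strings were encoded.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of the HBase API.
final class ColumnDecodingSketch {

    // Turns raw column-name/value bytes into strings, assuming both are UTF-8 encoded.
    static Map<String, String> decode(Map<byte[], byte[]> rawRow) {
        Map<String, String> decoded = new LinkedHashMap<String, String>();
        for (Map.Entry<byte[], byte[]> cell : rawRow.entrySet()) {
            decoded.put(new String(cell.getKey(), StandardCharsets.UTF_8),
                        new String(cell.getValue(), StandardCharsets.UTF_8));
        }
        return decoded;
    }

    public static void main(String[] args) {
        Map<byte[], byte[]> raw = new LinkedHashMap<byte[], byte[]>();
        raw.put("post:title".getBytes(StandardCharsets.UTF_8),
                "Hello World".getBytes(StandardCharsets.UTF_8));
        System.out.println(decode(raw).get("post:title"));
    }
}
```

With ASCII-only data like in this tutorial the result is the same either way. The difference only shows up once titles or author names contain non-ASCII characters and the program runs on a machine with a different default charset.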

Happy coding! Check out HBase’s Javadoc for more examples.

Please leave feedback in the comments.

This entry was posted on Monday, December 22nd, 2008 at 15:42 and is filed under Programming.

32 Responses

February 4th, 2009 at 19:40
AlexAxe Says:

Hello,
Great job. But not enough info. Where can I read more?

Thank you
AlexAxe

February 6th, 2009 at 03:44
RoninJoe Says:

Nice post. There are some more examples on the HBase wiki, but this is a nice “hello world”.

Thanks,
Joe

February 13th, 2009 at 22:33
Amandeep Khurana Says:

Good info to get someone started.

Thanks!

March 28th, 2009 at 19:01
check_writer Says:

what else do i need to do to make sure my configuration is read?

June 17th, 2009 at 18:30
Kevin Says:

Hello test:
I want to download hbase 0.19.1.jar. Where can I download it?
I want to use it on Eclipse for Java.
How many jars do I need to prepare?

July 3rd, 2009 at 15:25
工作=磨練? » [yes]Hbase tutorial Says:

[...] Hbase Tutorial for beginners http://ole-martin.net/hbase-tutorial-for-beginners/ [...]

July 15th, 2009 at 19:42
Baby Steps into HBase « write-only by Gregg Lind Says:

[...] Ole-Martin Mørk’s instructions was a breeze!  Even though I know almost nothing about Java and the Java environment, I managed [...]

July 24th, 2009 at 10:17
Christine Says:

Very nice,
it helped me for the start.
thank you

December 1st, 2009 at 04:56
Ian Kallen Says:

Thanks for posting a real simple example. Since I’ve been curious about JRuby too, I ported the programmatic access example to ruby, see http://gist.github.com/246032

December 16th, 2009 at 18:02
hferreira Says:

Hi,

I’m a newbie on this, just started today. How do I run the Java class? On which machine? In the Java code I don’t see any parameters like the IP or the port. Is there any special command to run Java classes or jar files on HBase?

Thanks in advance

March 8th, 2010 at 02:39
Aaron Says:

Thanks for putting this together. And the links you collected at http://delicious.com/olemartin/hbase were a great help in me understanding hbase. Thanks!

March 9th, 2010 at 12:56
Dobrynin Says:

This brilliant phrase will come in very handy

April 13th, 2010 at 10:06
Amol Says:

import org.apache.hadoop.hbase.io.RowResult;
it has been Deprecated.
I am unable to use this code please help

June 14th, 2010 at 06:37
RahulRinayat Says:

Many thanks, you are a good person. Keep uploading such nice tutorials.

June 20th, 2010 at 05:23
Rameshkumar Says:

nice post, give more…..

July 15th, 2010 at 14:52
Amol Gupta Says:

@Ramesh Kumar
Refer to the article in Linux For You, June 2010.

August 23rd, 2010 at 22:42
Sujee Maniyam Says:

I have a Hbase MapReduce tutorial here.
http://sujee.net/tech/articles/hbase-map-reduce-freq-counter/

September 28th, 2010 at 20:18
Ashok Says:

Hi folks,

I am getting the error. Can u please clarify:

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/client/HTable

November 17th, 2010 at 03:17
HBase tutorial for beginners - CLucene教程 Says:

[...] Original post: http://ole-martin.net/hbase-tutorial-for-beginners/ [...]

December 5th, 2010 at 09:37
Preshit Says:

Great Post for beginners. Best hello world program. Thank You.

May 2nd, 2011 at 13:51
Misty Says:

Can you please guide me how to use HBase in Java programs, I tried running your example but it gives this error

11/05/02 17:44:40 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:60000. Already tried 0 time(s).
11/05/02 17:44:42 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:60000. Already tried 1 time(s).
11/05/02 17:44:44 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:60000. Already tried 2 time(s).
11/05/02 17:44:46 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:60000. Already tried 3 time(s).
11/05/02 17:44:48 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:60000. Already tried 4 time(s).
11/05/02 17:44:50 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:60000. Already tried 5 time(s).
11/05/02 17:44:52 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:60000. Already tried 6 time(s).
11/05/02 17:44:54 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:60000. Already tried 7 time(s).
11/05/02 17:44:56 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:60000. Already tried 8 time(s).
11/05/02 17:44:58 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:60000. Already tried 9 time(s).
11/05/02 17:44:59 INFO client.HConnectionManager$TableServers: Attempt 0 of 10 failed with . Retrying after sleep of 2000

please let me know where am i going wrong. please suggest a simple way to use HBase

June 6th, 2011 at 21:11
Ramesh Babu Says:

Check your IP address in the Host File and it should match the local machine IP

August 18th, 2011 at 09:08
HBase tutorial for beginners « meisterxu Says:

[...] HBase tutorial for beginners From ole-martin.net » HBase tutorial for beginners – a blog by Ole-Martin Mørk. [...]

December 2nd, 2011 at 06:04
hbase | riding in tech world. Says:

[...] good tutorial for beginners: http://ole-martin.net/hbase-tutorial-for-beginners/ [...]

December 21st, 2011 at 13:10
Rahul Says:

Hi
Can anybody tell me how to compile and run the HBaseConnector program given in the tutorial above?
I tried a lot but not successful if anybody knows any link please share.
Kindly Help.
Thanks

December 21st, 2011 at 13:19
ole-martin Says:

This post is old Rahul. There might have been some changes to the hbase api since then.
What does your compiler say?

February 21st, 2012 at 07:51
tahreem Says:

Hey, I’ve been trying this code but it’s not executing. The compiler says:
package org.apache.hadoop.hbase.client does not exist.
import org.apache.hadoop.hbase.client.HTable .

package org.apache.hadoop.hbase does not exist.
import org.apache.hadoop.hbase.HBaseConfiguration.

package org.apache.hadoop.hbase.io does not exist.
import org.apache.hadoop.hbase.io.RowResult.

plz help asap.

March 8th, 2012 at 10:38
amsal Says:

hi
I have created a table in HBase with 12 columns in each row, and each column has 8 qualifiers. When I try to read a complete row, it returns the correct value for 1:1 in row 1 but returns null for 1:2.
It reads all the columns correctly from 2 to 12.
Please help me solve this problem.
I am using this code for reading. It is inside a for loop that runs from 1 to 10:

train[0][i] = Double.parseDouble(Bytes.toString(r.getValue(Bytes.toBytes(Integer.toString(i)), Bytes.toBytes("1"))));

train[1][i] = Double.parseDouble(Bytes.toString(r.getValue(Bytes.toBytes(Integer.toString(i)), Bytes.toBytes("2"))));

train[2][i] = Double.parseDouble(Bytes.toString(r.getValue(Bytes.toBytes(Integer.toString(i)), Bytes.toBytes("3"))));

train[3][i] = Double.parseDouble(Bytes.toString(r.getValue(Bytes.toBytes(Integer.toString(i)), Bytes.toBytes("4"))));

train[4][i] = Double.parseDouble(Bytes.toString(r.getValue(Bytes.toBytes(Integer.toString(i)), Bytes.toBytes("5"))));

train[5][i] = Double.parseDouble(Bytes.toString(r.getValue(Bytes.toBytes(Integer.toString(i)), Bytes.toBytes("6"))));

train[6][i] = Double.parseDouble(Bytes.toString(r.getValue(Bytes.toBytes(Integer.toString(i)), Bytes.toBytes("7"))));

train[7][i] = Double.parseDouble(Bytes.toString(r.getValue(Bytes.toBytes(Integer.toString(i)), Bytes.toBytes("8"))));

April 12th, 2012 at 23:26
timekiller Says:

The code above is very out of date. I made the minimum number of necessary adjustments to just get it compiling and running with HBase 0.90.4. It’s clearly in poor form now in a number of ways, but I figured I’d post it anyway, you can clean it up from here…

import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.util.Bytes;

import java.util.HashMap;
import java.util.Map;
import java.io.IOException;

public class HBaseConnector {

    public static Map<String, String> retrievePost(String postId) throws IOException {
        HTable table = new HTable(new HBaseConfiguration(), "blogposts");
        Map<String, String> post = new HashMap<String, String>();

        Result result = table.get(new Get(Bytes.toBytes(postId)));

        // Each KeyValue is one cell; rebuild the "family:qualifier" column name as the map key.
        for (KeyValue column : result.list()) {
            post.put(new String(column.getFamily()) + ":" + new String(column.getQualifier()),
                    new String(column.getValue()));
        }
        return post;
    }

    public static void main(String[] args) throws IOException {
        Map<String, String> blogpost = HBaseConnector.retrievePost("post1");
        System.out.println(blogpost.get("post:title"));
        System.out.println(blogpost.get("post:author"));
    }
}

April 16th, 2012 at 06:49
kiran Says:

It’s a very good tutorial. It would be a great help if I could get more good explanations like the ones you mentioned.

April 2nd, 2013 at 05:41
manjula Says:

I get the following exception when i run the above code in eclipse. I also added the jars mentioned. Please reply ASAP.

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/zookeeper/KeeperException
at HBaseConnector.retrievePost(HBaseConnector.java:21)
at HBaseConnector.main(HBaseConnector.java:36)
Caused by: java.lang.ClassNotFoundException: org.apache.zookeeper.KeeperException
at java.net.URLClassLoader$1.run(Unknown Source)
at java.net.URLClassLoader$1.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
... 2 more

April 3rd, 2013 at 09:53
Ole-Martin Says:

As mentioned at the top of the post, this post is old and the code doesn’t compile.