Hashing is part of Data Vault 2.0. Hash Keys are used to
- provide a surrogate for business keys
- detect changes in fields in Satellite tables.
select upper(rawtohex(DBMS_OBFUSCATION_TOOLKIT.MD5(input_string => upper('Hash Function Test')))) as Hash
from dual;
HASH
—————————————————————————————————————————-
477203E7062D1C485D7BF526FCF5CBAF
Example with SQL Server:
select UPPER(CONVERT(char(32), HASHBYTES('MD5', UPPER('Hash Function Test')), 2)) as Hash
DB2 has no built-in MD5 Hash Function that produces compatible results. DB2 has an Oracle compatibility mode that provides some of the functions from Oracle. Unfortunately, DBMS_OBFUSCATION_TOOLKIT is not part of the Oracle compatibility mode. Therefore it is necessary to extend DB2 functionality by implementing a UDF.
There are several Open Source MD5 implementations in C available that could be used to create a DB2 UDF. A ready-for-use DB2 approach is provided by Helmut K. C. Tessarek on github.
He provides C Code and DDL scripts to create DB2 UDFs for Linux and Windows. The complete installation is documented in a manual available on github. Installation steps for Linux in short are:
- If not already part of your Linux system, install apr and apr-util from http://apr.apache.org/
- Install Helmut K. C. Tessarek’s functions by running makertn
Variable APRPATH and APUPATH must be set within the script (e.g. /usr/local/apr/bin) - Run register.dll to create the UDF functions within DB2.
ldconfig -p |grep apr
libexpat.so.0 (libc6,x86-64) => /usr/local/apr/lib/libexpat.so.0
libexpat.so (libc6,x86-64) => /usr/local/apr/lib/libexpat.so
libaprutil-1.so.0 (libc6,x86-64) => /usr/local/apr/lib/libaprutil-1.so.0
libaprutil-1.so (libc6,x86-64) => /usr/local/apr/lib/libaprutil-1.so
libapr-1.so.0 (libc6,x86-64) => /usr/local/apr/lib/libapr-1.so.0
libapr-1.so (libc6,x86-64) => /usr/local/apr/lib/libapr-1.so
If the the libs are not loaded, the path to apr/apr-util libs (e.g. /usr/local/apr/lib) has to be entered into file “/etc/ld.so.conf”. The command “ldconfig” will load the libs afterwards.
select UPPER(php_md5(UPPER('Hash Function Test')))
from sysibm.sysdummy1;
——————————–
477203E7062D1C485D7BF526FCF5CBAF
The result of “php_md5” is char(32). The other provided md5 function is a different implementation and produces different results.
Update: Db2 LUW 11.1 onwards has a hash function: HASH (<string>, algorithm). MD5 is algorithm = 1: HASH(‘Test’, 1)