Test case
$ cat /etc/redhat-release
Red Hat Enterprise Linux Server release 6.6 (Santiago)
$ env | grep NLS_LANG
NLS_LANG=AMERICAN_AMERICA.UTF8
NLS_LANG=AMERICAN_AMERICA.UTF8
$ sqlplus test/test
SQL*Plus: Release 12.1.0.2.0 Production on Tue Apr 21 17:26:00 2015
Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options
SQL> ALTER SESSION SET NLS_SORT=ESTONIAN;
Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options
SQL> ALTER SESSION SET NLS_SORT=ESTONIAN;
SQL>
SELECT
MY_CHAR,
CASE
WHEN
REGEXP_LIKE(MY_CHAR,'^[A-Z][a-zA-Z0-9_]{0,}$')
THEN
1
ELSE
0
END
AS
REGEXP_RESULT
FROM
(
SELECT
CHR(ASCII('A')-1+ROWNUM)
AS
MY_CHAR FROM
DUAL CONNECT
BY
LEVEL
<=
26
);
Result
MY_CHAR | REGEXP_RESULT |
A | 1 |
B | 1 |
C | 1 |
D | 1 |
E | 1 |
F | 1 |
G | 1 |
H | 1 |
I | 1 |
J | 1 |
K | 1 |
L | 1 |
M | 1 |
N | 1 |
O | 1 |
P | 1 |
Q | 1 |
R | 1 |
S | 1 |
T | 0 |
U | 0 |
V | 0 |
W | 0 |
X | 0 |
Y | 0 |
Z | 1 |
Expected result would be "1" for all rows. Isn't it?
Need
Need was to restrict primary key values of parameter table. Values should follow java enum rules. CHECK constraint and REGEXP_LIKE() function seems good and simple way to implement it.
Example below demonstrates that unexpected error is reported if NLS_SORT=ESTONIAN:
SQL>
CREATE
TABLE
PARAMETER_TABLE (
CODE VARCHAR2(50 CHAR) CONSTRAINT CODE_CK CHECK(REGEXP_LIKE(CODE,'^[A-Z][A-Z0-9_]{0,}$'))
, PRIMARY KEY (CODE)
);
SQL> ALTER SESSION SET NLS_SORT=ESTONIAN;
SQL> INSERT INTO PARAMETER_TABLE VALUES ('S');
1 row created.
SQL> INSERT INTO PARAMETER_TABLE VALUES ('T');
INSERT INTO PARAMETER_TABLE VALUES ('T')
*
ERROR at line 1:
ORA-02290: check constraint (TEST.CODE_CK) violated
Explanation turns out to be simple: REGEXP_LIKE() behavior depends on NLS_SORT value!
Check this example:
CODE VARCHAR2(50 CHAR) CONSTRAINT CODE_CK CHECK(REGEXP_LIKE(CODE,'^[A-Z][A-Z0-9_]{0,}$'))
, PRIMARY KEY (CODE)
);
SQL> ALTER SESSION SET NLS_SORT=ESTONIAN;
SQL> INSERT INTO PARAMETER_TABLE VALUES ('S');
1 row created.
SQL> INSERT INTO PARAMETER_TABLE VALUES ('T');
INSERT INTO PARAMETER_TABLE VALUES ('T')
*
ERROR at line 1:
ORA-02290: check constraint (TEST.CODE_CK) violated
Explanation
Explanation turns out to be simple: REGEXP_LIKE() behavior depends on NLS_SORT value!
Check this example:
ALTER SESSION SET NLS_SORT=ESTONIAN;
SELECT
MY_CHAR,
CASE
WHEN
REGEXP_LIKE(MY_CHAR,'^[A-Z][a-zA-Z0-9_]{0,}$')
THEN
1
ELSE
0
END
AS
REGEXP_RESULT
FROM
(
SELECT
SUBSTR(MY_CHAR,
LEVEL,
1)
MY_CHAR
FROM
(
SELECT
'ABCDEFGHIJKLMNOPQRSŠZŽTUVWÕÄÖÜXY'
AS
MY_CHAR FROM
DUAL
)
CONNECT
BY
LEVEL
<=
LENGTH(
MY_CHAR)
)
ORDER
BY
1;
MY_CHAR | REGEXP_RESULT |
A | 1 |
B | 1 |
C | 1 |
D | 1 |
E | 1 |
F | 1 |
G | 1 |
H | 1 |
I | 1 |
J | 1 |
K | 1 |
L | 1 |
M | 1 |
N | 1 |
O | 1 |
P | 1 |
Q | 1 |
R | 1 |
S | 1 |
Š | 1 |
Z | 1 |
Ž | 0 |
T | 0 |
U | 0 |
V | 0 |
W | 0 |
Õ | 0 |
Ä | 0 |
Ö | 0 |
Ü | 0 |
X | 0 |
Y | 0 |
Z is not last character in Estonian alphabet. Letter T is after Z. Therefore letter T is not covered by "regexp [A-Z]" expression.
Regexp_like() covers whole Estonian alphabet if you use "regexp [A-Y]" expression instead (only if NLS_SORT=ESTONIAN)!
PS
It seems that linux regexp (and sort) works also like regexp_like() in database. Check this example:
# LANG=en_US.UTF-8
# echo -e "S\nT\nX\nY\nZ" | grep -e "^[A-Z]\$"
S
T
X
Y
Z
# LANG=et_EE.UTF-8
# echo -e "S\nT\nX\nY\nZ" | grep -e "^[A-Z]\$"
S
Z
References
Bug 21025097 - REGEXP_LIKE FUNCTION DEPENDS ON SESSION VALUE FOR NLS_SORT
Registered in May 2015
Bug 12429872 - REGEXP_LIKE FUNCTION TAKING WRONG NLS_SORT DURING CONSTRAINT CREATION
Bug 12429872 - REGEXP_LIKE FUNCTION TAKING WRONG NLS_SORT DURING CONSTRAINT CREATION
Registered in May 2011
-- end --