Skip to content

Instantly share code, notes, and snippets.

@richardpascual
Created August 8, 2014 08:19
Show Gist options
  • Save richardpascual/7595a3f92bdb3c09168e to your computer and use it in GitHub Desktop.
Save richardpascual/7595a3f92bdb3c09168e to your computer and use it in GitHub Desktop.
How to remove unwanted non-alphanumeric characters from an input string
-- my-regexp-workshop (Oracle database table)
-- This is how to make the supporing schema tables. The autosequenced column value/trigger combination is optional.
CREATE SEQUENCE "MY_REGEXP_WORKSHOP_SEQ" MINVALUE 1 MAXVALUE 9999999999999999999999999999 INCREMENT BY 1 START WITH 21 CACHE 20 NOORDER NOCYCLE
/
CREATE TABLE "MY_REGEXP_WORKSHOP"
( "RECORD_ID" NUMBER NOT NULL ENABLE,
"CUSTOMER_NAME" VARCHAR2(50),
"INPUT_VALUE" VARCHAR2(50),
CONSTRAINT "MY_REGEXP_WORKSHOP_PK" PRIMARY KEY ("RECORD_ID")
USING INDEX ENABLE
)
/
CREATE OR REPLACE TRIGGER "BI_MY_REGEXP_WORKSHOP"
before insert on "MY_REGEXP_WORKSHOP"
for each row
begin
if :NEW."RECORD_ID" is null then
select "MY_REGEXP_WORKSHOP_SEQ".nextval into :NEW."RECORD_ID" from sys.dual;
end if;
end;
/
ALTER TRIGGER "BI_MY_REGEXP_WORKSHOP" ENABLE
/
We can make this file beautiful and searchable if this error is corrected: It looks like row 2 should actually have 1 column, instead of 3 in line 1.
-- Some sample test data intended to test out the regexp functionality.
"RECORD_ID","CUSTOMER_NAME","INPUT_VALUE"
"1","HA=?<>::RRI**+S",""
"2","GO(RDON)",""
"3","<SAMUEL>",""
"5","M%OR?RIS^^",""
"6","[Q]&UIX-OT//E@",""
"7","SA$B,RINA!!!",""
"4","A#SHTON*&^%$",""
-- remove-str: using Oracle PL/SQL Regular Expressions
-- This used to be hard with older versions of Oracle RDBMS. Now REGEXP recipes seem to simplify it all.
-- model implementation borrowed from the following source:
-- https://community.oracle.com/message/2231311#2231311
select regexp_replace('This is a test $%&^*&*'||chr(10)||'*"ú%bsfd.', '( *[[:punct:]])', '') from dual
-- applying the concept to a full table: MY_REGEXP_WORKSHOP:
select record_id, customer_name,
regexp_replace(customer_name, '( *[[:punct:]])', '') as converted_string
from my_regexp_workshop
order by record_id asc;
@richardpascual
Copy link
Author

Data-Scrubbing Text Inputs with Oracle: ORACLE-11g (and later)

remove-str Used to be the name of the custom function I developed to strip off unwanted symbols, or non-alphanumeric characters from data values processed through SQL or PL/SQL driven processes.

Using Oracle PL/SQL Regular Expressions

This used to be hard with older versions of the Oracle RDBMS. Now REGEXP recipes seem to simplify it all. Oracle official references on the REGEXP function reveal lots of great Oracle Derived and more standard POSIX Flavored regular expression recipes. Find or build the ones that work the best for you and share!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment