Friday, 30 October 2015

Removing Duplicate Items From a String

We can generalize the above example to afterseparator(item)(separator\1)+beforeseparator, where afterseparator and beforeseparator are zero-length. So if you want to remove consecutive duplicates from a comma-delimited list, you could search for (?<=,|^)([^,]*)(,\1)+(?=,|$) and replace each match with the text of the first capturing group.
The positive lookbehind (?<=,|^) forces the regex engine to start matching at the start of the string or right after a comma. ([^,]*) captures the item. (,\1)+ matches one or more consecutive duplicates of that item. Finally, the positive lookahead (?=,|$) makes sure the last duplicate is a complete item, not just the beginning of a longer one, by requiring a comma or the end of the string to follow.
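As a rough illustration, here is a minimal Python sketch of the comma-delimited case. Two things in it are assumptions and not part of the original regex: Python's re module only accepts fixed-width lookbehinds, so (?<=,|^) is rewritten as the equivalent (?:(?<=,)|^), and the function name remove_consecutive_duplicates is made up for this example. Flavors that allow alternatives of different lengths inside lookbehind can use the regex exactly as written above.

```python
import re

# Python's re rejects (?<=,|^) because the two alternatives differ in width,
# so the lookbehind is split into the equivalent (?:(?<=,)|^).
DUPLICATES = re.compile(r"(?:(?<=,)|^)([^,]*)(,\1)+(?=,|$)")


def remove_consecutive_duplicates(csv_line: str) -> str:
    """Collapse each run of identical consecutive items into one item.

    Hypothetical helper name, used only for this illustration.
    """
    # Replace the whole run (item plus its repeats) with the captured item.
    return DUPLICATES.sub(r"\1", csv_line)


print(remove_consecutive_duplicates("a,a,b,b,b,c,a"))  # a,b,c,a
print(remove_consecutive_duplicates("ab,abc"))         # ab,abc (left unchanged)
```

The second call shows why the lookahead matters: without it, ab,ab would match inside ab,abc and the list would be corrupted to abc. Note that only consecutive duplicates are removed, so the trailing a in the first call stays.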


