ORA-01489 or ORA-22835 when using LISTAGG

There are various ways to aggregate data from a number of rows into a single row. Since Oracle 11gR2, the function LISTAGG has been introuduced and it’s quite handy to use it for this purpose.

LISTAGG orders data within each group specified in the ORDER BY clause and then concatenates the values of the measure column.

  • As a single-set aggregate function, LISTAGG operates on all rows and returns a single output row.
  • As a group-set aggregate, the function operates on and returns an output row for each group defined by the GROUP BY clause.
  • As an analytic function, LISTAGG partitions the query result set into groups based on one or more expression in the query_partition_clause.

If the measure column is of type RAW, then the return data type is RAW. Otherwise, the return data type is VARCHAR2. Staring with Oracle 12c, the maximum length of the return data type depends on the value of the MAX_STRING_SIZE initialization parameter.

If MAX_STRING_SIZE = EXTENDED, then the maximum length is 32767 bytes for the VARCHAR2 and RAW data types. If MAX_STRING_SIZE = STANDARD, then the maximum length is 4000 bytes for the VARCHAR2 data type and 2000 bytes for the RAW data type. A final delimiter is not included when determining if the return value fits in the return data type.

With this limitation and default value of MAX_STRING_SIZE, you might see the error ORA-01489 or ORA-22835 when aggregating varchar2 or clob respectively because the result length is more than 4000 characters.

Below are three different ways to workaround this:

  1. Change the value parameter from STANDARD to EXTENDED. This method needs to put the database in the upgrade mode and run utl32k.sql after the change. So downtime is needed. Another thing is that there is still a limit of 32767 bytes when using LISTAGG.
  2. Write your own aggregation function which also pushs the limit to 32767 bytes as mentioned here. The function uses ODCIAggregate interface. That’s because in PL/SQL, Oracle allows varchar2 up to 32767 bytes which is much larger than 4000 bytes for varchar2 in SQL. (see the following table from Oracle database PL/SQL language reference)
Data TypeMaximum Size in PL/SQLMaximum Size in SQL
CHARFoot 132,767 bytes2,000 bytes
NCHARFoot 132,767 bytes2,000 bytes
RAWFoot 132,767 bytes2,000 bytesFoot 2
VARCHAR2Foot 132,767 bytes4,000 bytesFoot 2
NVARCHAR2Foot 132,767 bytes4,000 bytesFoot 2
LONGFoot 332,760 bytes2 gigabytes (GB) – 1
LONG RAWFoot 332,760 bytes2 GB
BLOB128 terabytes (TB)(4 GB – 1) * database_block_size
CLOB128 TB(4 GB – 1) * database_block_size
NCLOB128 TB(4 GB – 1) * database_block_size
Table 3-1 Data Types with Different Maximum Sizes in PL/SQL and SQL
  • Footnote 1 When specifying the maximum size of a value of this data type in PL/SQL, use an integer literal (not a constant or variable) whose value is in the range from 1 through 32,767.
  • Footnote 2 To eliminate this size difference, follow the instructions in Oracle Database SQL Language Reference.
  • Footnote 3 Supported only for backward compatibility with existing applications.

CREATE OR REPLACE TYPE t_string_agg AS OBJECT
(
  g_string  VARCHAR2(32767),

  STATIC FUNCTION ODCIAggregateInitialize(sctx  IN OUT  t_string_agg)
    RETURN NUMBER,

  MEMBER FUNCTION ODCIAggregateIterate(self   IN OUT  t_string_agg,
                                       value  IN      VARCHAR2 )
     RETURN NUMBER,

  MEMBER FUNCTION ODCIAggregateTerminate(self         IN   t_string_agg,
                                         returnValue  OUT  VARCHAR2,
                                         flags        IN   NUMBER)
    RETURN NUMBER,

  MEMBER FUNCTION ODCIAggregateMerge(self  IN OUT  t_string_agg,
                                     ctx2  IN      t_string_agg)
    RETURN NUMBER
);

CREATE OR REPLACE TYPE BODY t_string_agg IS
  STATIC FUNCTION ODCIAggregateInitialize(sctx  IN OUT  t_string_agg)
    RETURN NUMBER IS
  BEGIN
    sctx := t_string_agg(NULL);
    RETURN ODCIConst.Success;
  END;

  MEMBER FUNCTION ODCIAggregateIterate(self   IN OUT  t_string_agg,
                                       value  IN      VARCHAR2 )
    RETURN NUMBER IS
  BEGIN
    SELF.g_string := self.g_string || ',' || value;
    RETURN ODCIConst.Success;
  END;

  MEMBER FUNCTION ODCIAggregateTerminate(self         IN   t_string_agg,
                                         returnValue  OUT  VARCHAR2,
                                         flags        IN   NUMBER)
    RETURN NUMBER IS
  BEGIN

    returnValue := SUBSTR(SELF.g_string, 2);
    RETURN ODCIConst.Success;
  END;

  MEMBER FUNCTION ODCIAggregateMerge(self  IN OUT  t_string_agg,
                                     ctx2  IN      t_string_agg)
    RETURN NUMBER IS
  BEGIN
    SELF.g_string := SELF.g_string || ctx2.g_string;
    RETURN ODCIConst.Success;
  END;
END;

CREATE OR REPLACE FUNCTION string_agg (p_input VARCHAR2)
RETURN VARCHAR2
PARALLEL_ENABLE AGGREGATE USING t_string_agg;
  1. If you need to aggregate data more than 32767 bytes, a better way is to use XMLAgg function. First use XMLElement to assemble an XML for each row, then use XMLAgg to aggregate rows into one row. Both XMLAgg and XMLElement will return an instance of XMLType which is stored as “SYS.XMLTYPE STORAGE BINARY” — basically using SecureFiles. (see XML DB Developer’s Guide for XMLType Data Type and Storage Models). XMLType has a bunch of subprograms which can be used. Here we use EXTRACT and GETCLOBVAL to extract contents and convert XMLType to clob.

With 3 different ways, for a table named employee which contains two columns, one is department number and another is employee name. Each row is for an employee. You can aggregate employee names seperated by “,” into one row by each department number.

SELECT dept_no, listagg(emp_name, ',') within group (order by emp_name) FROM employee GROUP BY dept_no;

SELECT /*+ PARALLEL(2) */ dept_no, string_agg(emp_name) AS emp_name_list FROM employee GROUP BY dept_no;

SELECT dept_no, rtrim(xmlagg(xmlelement(e, emp_name||',')).extract('//text()').getClobVal(), ',') as emp_name_list FROM employee GROUP BY dept_no;

Note:

  1. When considering to set the parameter MAX_STRING_SIZE = EXTENDED, there will be other consequences such as when indexing a column with extended data type (e.g. a VARCHAR2 column with declared size larger than 4000 bytes)
  2. With listagg, Oracle 19c has an overflow clause (ON OVERFLOW TRUNCATE) which can truncate value.
  3. With OVER clause of listagg, it can be used as an Analytic function. And listagg can also remove duplicates easily with distinct keyword. All of them make it more favourable than other methods if length is not a concern.

References:

  1. Database Reference 19c A.1 Datatype Limits
  2. PL/SQL Packages and Types Reference 19c -XMLTYPE
  3. XML Developer’s Guide 19c
  4. SQL Language Reference 19c – listagg
  5. String Aggregation Techniques
  6. LISTAGG DISTINCT in Oracle Database 19c
  7. Data Cartridge Developer’s Guide 19c — User-Defined Aggregate Functions
  8. How To Distinguish XMLTYPE (Doc ID 2743690.1)
  9. WMSYS.WM_CONCAT Should Not Be Used For Customer Applications, It Is An Internal Function (Doc ID 1336219.1)

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s