u8_validate man page on SunOS

Man page or keyword search:  
man Server   20652 pages
apropos Keyword Search (all sections)
Output format
SunOS logo
[printable version]

u8_validate(3C)		 Standard C Library Functions	       u8_validate(3C)

NAME
       u8_validate - validate UTF-8 characters and calculate the byte length

SYNOPSIS
       #include <sys/u8_textprep.h>

       int u8_validate(char *u8str, size_t n, char **list, int flag,
	    int *errnum);

PARAMETERS
       u8str	 The UTF-8 string to be validated.

       n	 The maximum number of bytes in u8str that can be examined and
		 validated.

       list	 A list of null-terminated character  strings  in  UTF-8  that
		 must  be  additionally checked against as invalid characters.
		 The last string in list must be null to indicate there is  no
		 further string.

       flag	 Possible  validation  options constructed by a bitwise-inclu‐
		 sive-OR of the following values:

		 U8_VALIDATE_ENTIRE

		     By default, u8_validate() looks at the first character or
		     up	 to n bytes, whichever is smaller in terms of the num‐
		     ber of bytes to be consumed, and returns with the result.

		     When this option is used, u8_validate() will check up  to
		     n	bytes  from  u8str  and possibly more than a character
		     before returning the result.

		 U8_VALIDATE_CHECK_ADDITIONAL

		     By default, u8_validate() does not use list supplied.

		     When this option is supplied with	a  list	 of  character
		     strings,	u8_validate()	additionally  validates	 u8str
		     against the character  strings  supplied  with  list  and
		     returns EBADF in errnum if u8str has any one of the char‐
		     acter strings in list.

		 U8_VALIDATE_UCS2_RANGE

		     By default, u8_validate() uses the entire Unicode	coding
		     space of U+0000 to U+10FFFF.

		     When  this	 option is specified, the valid Unicode coding
		     space is smaller to U+0000 to U+FFFF.

       errnum	 An error occurred during validation.	The  following	values
		 are supported:

		 EBADF	   Validation failed because list-specified characters
			   were found in the string pointed to by u8str.

		 EILSEQ	   Validation failed because an illegal byte was found
			   in the string pointed to by	u8str.

		 EINVAL	   Validation  failed  because	an incomplete byte was
			   found in the string pointed to by  u8str.

		 ERANGE	   Validation  failed  because	character  bytes  were
			   encountered	that are outside the range of the Uni‐
			   code coding space.

DESCRIPTION
       The u8_validate() function validates u8str in UTF-8 and determines  the
       number of bytes constituting the character(s) pointed to by u8str.

RETURN VALUES
       If u8str is a null pointer, u8_validate() returns 0. Otherwise, u8_val‐
       idate() returns either the number of bytes that constitute the  charac‐
       ters if the next n or fewer bytes form valid characters, or -1 if there
       is an validation failure, in which case it may set errnum  to  indicate
       the error.

EXAMPLES
       Example 1 Determine the length of the first UTF-8 character.

	 #include <sys/u8_textprep.h>

	 char u8[MAXPATHLEN];
	 int errnum;
	 .
	 .
	 .
	 len = u8_validate(u8, 4, (char **)NULL, 0, &errnum);
	 if (len == -1) {
	     switch (errnum) {
		 case EILSEQ:
		 case EINVAL:
		     return (MYFS4_ERR_INVAL);
		 case EBADF:
		     return (MYFS4_ERR_BADNAME);
		 case ERANGE:
		     return (MYFS4_ERR_BADCHAR);
		 default:
		     return (-10);
	     }
	 }

       Example	2  Check  if  there  are  any invalid characters in the entire
       string.

	 #include <sys/u8_textprep.h>

	 char u8[MAXPATHLEN];
	 int n;
	 int errnum;
	 .
	 .
	 .
	 n = strlen(u8);
	 len = u8_validate(u8, n, (char **)NULL, U8_VALIDATE_ENTIRE, &errnum);
	 if (len == -1) {
	     switch (errnum) {
		 case EILSEQ:
		 case EINVAL:
		     return (MYFS4_ERR_INVAL);
		 case EBADF:
		     return (MYFS4_ERR_BADNAME);
		 case ERANGE:
		     return (MYFS4_ERR_BADCHAR);
		 default:
		     return (-10);
	     }
	 }

       Example 3 Check if there is any invalid character, including prohibited
       characters, in the entire string.

	 #include <sys/u8_textprep.h>

	 char u8[MAXPATHLEN];
	 int n;
	 int errnum;
	 char *prohibited[4] = {
	     ".", "..", "\\", NULL
	 };
	 .
	 .
	 .
	 n = strlen(u8);
	 len = u8_validate(u8, n, prohibited,
	     (U8_VALIDATE_ENTIRE|U8_VALIDATE_CHECK_ADDITIONAL), &errnum);
	 if (len == -1) {
	     switch (errnum) {
		 case EILSEQ:
		 case EINVAL:
		     return (MYFS4_ERR_INVAL);
		 case EBADF:
		     return (MYFS4_ERR_BADNAME);
		 case ERANGE:
		     return (MYFS4_ERR_BADCHAR);
		 default:
		     return (-10);
	     }
	 }

ATTRIBUTES
       See attributes(5) for descriptions of the following attributes:

       ┌─────────────────────────────┬─────────────────────────────┐
       │      ATTRIBUTE TYPE	     │	    ATTRIBUTE VALUE	   │
       ├─────────────────────────────┼─────────────────────────────┤
       │Interface Stability	     │Committed			   │
       ├─────────────────────────────┼─────────────────────────────┤
       │MT-Level		     │MT-Safe			   │
       └─────────────────────────────┴─────────────────────────────┘

SEE ALSO
       u8_strcmp(3C),	u8_textprep_str(3C),   attributes(5),	u8_strcmp(9F),
       u8_textprep_str(9F), u8_validate(9F)

       The Unicode Standard (http://www.unicode.org)

SunOS 5.10			  18 Sep 2007		       u8_validate(3C)
[top]

List of man pages available for SunOS

Copyright (c) for man pages and the logo by the respective OS vendor.

For those who want to learn more, the polarhome community provides shell access and support.

[legal] [privacy] [GNU] [policy] [cookies] [netiquette] [sponsors] [FAQ]
Tweet
Polarhome, production since 1999.
Member of Polarhome portal.
Based on Fawad Halim's script.
....................................................................
Vote for polarhome
Free Shell Accounts :: the biggest list on the net