15.15 C字符串转换为Python字符串
最后更新于:2022-04-01 15:41:24
## 问题
You want to convert strings from C to Python bytes or a string object.
## 解决方案
For C strings represented as a pair char [*](#), int, you must decide whether or not youwant the string presented as a raw byte string or as a Unicode string. Byte objects canbe built using Py_BuildValue() as follows:
char _s; /_ Pointer to C string data _/int len; /_ Length of data [*](#)/
/* Make a bytes object [*](#)/PyObject [*](#)obj = Py_BuildValue(“y#”, s, len);
If you want to create a Unicode string and you know that s points to data encoded asUTF-8, you can use the following:
PyObject [*](#)obj = Py_BuildValue(“s#”, s, len);
If s is encoded in some other known encoding, you can make a string using PyUnicode_Decode() as follows:
PyObject [*](#)obj = PyUnicode_Decode(s, len, “encoding”, “errors”);
/* Examples /*obj = PyUnicode_Decode(s, len, “latin-1”, “strict”);obj = PyUnicode_Decode(s, len, “ascii”, “ignore”);
If you happen to have a wide string represented as a wchar_t [*](#), len pair, there are afew options. First, you could use Py_BuildValue() as follows:
wchar_t _w; /_ Wide character string _/int len; /_ Length [*](#)/
PyObject [*](#)obj = Py_BuildValue(“u#”, w, len);
Alternatively, you can use PyUnicode_FromWideChar():
PyObject [*](#)obj = PyUnicode_FromWideChar(w, len);
For wide character strings, no interpretation is made of the character data—it is assumedto be raw Unicode code points which are directly converted to Python.
## 讨论
Conversion of strings from C to Python follow the same principles as I/O. Namely, thedata from C must be explicitly decoded into a string according to some codec. Commonencodings include ASCII, Latin-1, and UTF-8. If you’re not entirely sure of the encodingor the data is binary, you’re probably best off encoding the string as bytes instead.When making an object, Python always copies the string data you provide. If necessary,it’s up to you to release the C string afterward (if required). Also, for better reliability,you should try to create strings using both a pointer and a size rather than relying onNULL-terminated data.