Error : 'utf-8' codec can't decode byte 0xb0 in position 14: invalid start byte

I'm a beginner at Python, and I would like to read multiple CSV files. When I read them with encoding = "ISO-8859-1", I get characters like this in my data: "DÂ°faut". So I tried UTF-8 instead, and I get this error: 'utf-8' codec can't decode byte 0xb0 in position 14: invalid start byte. Can someone help me, please? Thank you!

1 Answer

The encoding you decode with has to match the encoding the file was written with. UTF-8 stores basic Latin letters, digits and the usual symbols (the ASCII range) in one byte each and needs two to four bytes for every other Unicode character. The high bits of each byte tell the decoder what role the byte plays: a byte starting with 0 is a complete one-byte character, a byte starting with 11 begins a multi-byte character, and a byte starting with 10 is a continuation byte that may only follow such a lead byte. 0xb0 is 1011 0000 in binary, so it is a continuation byte; the decoder found it at position 14, where a new character should begin, which is why it reports "invalid start byte". In ISO-8859-1 the degree symbol (°) is the single byte 0xB0, so the file was most likely written in ISO-8859-1 or a similar single-byte encoding. In UTF-8 the same symbol would be encoded as 0xC2 0xB0.
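The byte-level behaviour described above can be checked directly in Python. This is only a small sketch using the degree symbol as the sample character; the variable names are illustrative and not taken from the question.

```python
degree = "°"

# The same character produces different bytes in the two encodings.
latin1_bytes = degree.encode("iso-8859-1")   # one byte: 0xB0
utf8_bytes = degree.encode("utf-8")          # two bytes: 0xC2 0xB0
print(latin1_bytes.hex(), utf8_bytes.hex())

# 0xB0 has the bit pattern 10xxxxxx, which UTF-8 reserves for continuation bytes.
bit_pattern = format(0xB0, "08b")
print(bit_pattern)

# Decoding the ISO-8859-1 byte as UTF-8 fails, because a continuation byte
# cannot begin a character.
try:
    latin1_bytes.decode("utf-8")
    error_message = ""
except UnicodeDecodeError as exc:
    error_message = str(exc)
print(error_message)

# The opposite mismatch does not raise; it silently yields wrong characters.
mojibake = utf8_bytes.decode("iso-8859-1")
print(mojibake)
```

The last line shows the other direction of the mismatch: UTF-8 bytes read as ISO-8859-1 never raise an error, they just turn "°" into "Â°".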

In any case: always decode with the same encoding that was used to encode the file. If you need characters outside a single-byte code page, use UTF-8. In general, using one of the UTF encodings is good advice.
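If the encoding of each CSV file is not known in advance, one possible approach is to try strict UTF-8 first and fall back to ISO-8859-1 only when that fails. The sketch below assumes those two candidates are the only ones in play; `decode_with_fallback` and `read_csv_rows` are hypothetical helper names, not part of any library.

```python
import csv
import io


def decode_with_fallback(raw, encodings=("utf-8", "iso-8859-1")):
    """Return (text, encoding) for the first candidate that decodes raw bytes."""
    for encoding in encodings:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings could decode the data")


def read_csv_rows(path, encodings=("utf-8", "iso-8859-1"), delimiter=","):
    """Read a CSV file as bytes, decode it with a fallback, and parse the rows."""
    with open(path, "rb") as handle:
        raw = handle.read()
    text, _ = decode_with_fallback(raw, encodings)
    return list(csv.reader(io.StringIO(text, newline=""), delimiter=delimiter))
```

The order matters: ISO-8859-1 accepts every possible byte sequence, so it must come last, and a successful decode with it does not prove the file really uses that encoding. When the true encoding is known, passing it directly to `open(path, encoding=...)` is simpler.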
