Why does Python have different types of bytes?

October 22, 2019, at 9:20 PM

I have two variables, one is b_d, the other is b_test_d.

When I type b_d in the console, it shows:

b'\\\x8f\xc2\xf5(\\\xf3?Nb\x10X9\xb4\x07@\x00\x00\x00\x00\x00\x00\xf0?'

When I type b_test_d in the console, it shows:

b'[-2.1997713216,-1.4249271187,-1.1076795391,1.5224958034,-0.1709796203,0.3663875698,0.14846441,-0.7415930061,-1.7602231949,0.126605689,0.6010934792,-0.466415358,1.5675525816,1.00836295,1.4332792992,0.6113384254,-1.8008540571,-0.9443408896,1.0943670356,-1.0114642686,1.443892627,-0.2709427287,0.2990462512,0.4650133591,0.2560791327,0.2257600462,-2.4077429827,-0.0509983213,1.0062187148,0.4315075795,-0.6116110033,0.3495131413,-0.3249903375,0.3962305931,-0.1985757285,1.165792433,-1.1171953063,-0.1732557874,-0.3791600654,-0.2860519953,0.7872658859,0.217728374,-0.4715179983,-0.4539613811,-0.396353657,1.2326862425,-1.3548659354,1.6476230786,0.6312713442,-0.735444661,-0.6853447369,-0.8480631975,0.9538606574,0.6653542368,-0.2833696021,0.7281604648,-0.2843872095,0.1461980484,-2.3511731773,-0.3118047948,-1.6938613893,-0.0359659687,-0.5162134311,-2.2026641552,-0.7294895084,0.7493073213,0.1034096968,0.6439803068,-0.2596155272,0.5851323455,1.0173285542,-0.7370464113,1.0442954406,-0.5363832595,0.0117795359,0.2225617514,0.067571974,-0.9154681906,-0.293808596,1.3717113798,0.4919516922,-0.3254944005,1.6203744532,-0.1810222279,-0.6111596457,1.344064259,-0.4596893179,-0.2356197144,0.4529942046,1.6244603294,0.1849995925,0.6223061217,-0.0340662398,0.8365900535,-0.6804201929,0.0149665385,0.4132453788,0.7971962667,-1.9391525531,0.1440486871,-0.7103617816,0.9026539637,0.6665798363,-1.5885073458,1.4084493329,-1.397040825,1.6215697667,1.7057148522,0.3802647045,-0.4239271483,1.4773614536,1.6841461329,0.1166845529,-0.3268795898,-0.9612751672,0.4062399443,0.357209662,-0.2977362702,-0.3988147401,-0.1174652196,0.3350589818,-1.8800423584,0.0124169787,1.0015110265,0.789541751,-0.2710408983,1.4987300181,-1.1726824468,-0.355322591,0.6567978423,0.8319110558,0.8258835069,-1.1567887763,1.9568551122,1.5148655075,1.0589021915,-0.4388232953,-0.7451680183,-2.1897621693,0.4502135234,-1.9583089063,0.1358789518,-1.7585860897,0.452259777,0.7406800349,-1.3578980418,1.108740204,-1.1986272667,-1.0273598206,-1.8165822264,1.0853600894,-0.273943514,0.8589890805,1.3639094329,-0.6121993589,-0.0587067992,0.0798457584,1.0992814648,-1.0455733611,1.4780003064,0.5047157705,0.1565451605,0.9656886956,-0.5998330255,0.4846727299,0.8790524818,1.0288893846,-2.0842447397,0.4074607421,2.1523241756,-1.1268047125,-0.6016001524,-1.3302141561,1.1869516954,1.0988060125,0.7405900405,1.1813110811,0.8685330644,2.0927140519,-1.7171952009,0.9231993147,0.320874115,0.7465845079,-0.1034484959,-0.4776822499,0.436218328,-0.4083564542,0.4835567895,1.0733230373,-0.858658902,-0.4493571034,0.4506418221,1.6696649735,-0.9189799982,-1.1690356499,-1.0689397924,0.3174297583,1.0403701444,0.5440082812,-0.1128248996]'

Both are of type bytes, yet numpy.frombuffer can read b_d but not b_test_d, and the two look very different. Why do I have these two kinds of bytes?

Thank you.

Answer 1

[A]nyone can point out how to use Json marshall to convert the byte to the same type of bytes as the first one?

This isn't the right question, but I think I know what you're asking. You say you're getting the 2nd array via JSON marshalling, but that it's also not under your control:

it was obtained by json marshal (convert a received float array to byte array, and then convert the result to base64 string, which is done by someone else)

That's fine though, you just have to do a few steps of processing to get to a state equivalent to the first set of bytes.

First, some context on what's going on. You've already seen that numpy can understand your first set of bytes.

>>> numpy.frombuffer(data)
[1.21  2.963 1.   ]

Based on its output, it looks like numpy is interpreting your data as 3 doubles, with 8 bytes each (24 bytes total)...

>>> data = b'\\\x8f\xc2\xf5(\\\xf3?Nb\x10X9\xb4\x07@\x00\x00\x00\x00\x00\x00\xf0?'
>>> len(data)
24

...which the struct module can also interpret.

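import struct
# Separate into 3 doubles (8 bytes each)
x, y, z = data[:8], data[8:16], data[16:]
# struct.unpack always returns a tuple, hence the one-element tuples below
print([struct.unpack('d', i) for i in (x, y, z)])
[(1.21,), (2.963,), (1.0,)]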
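
To make the difference between the two kinds of bytes concrete, here is a minimal sketch using only the standard library. It assumes the first byte string is little-endian (the '<' prefix is my assumption; the snippets above use the machine's native order), and the short text literal is a hypothetical stand-in for JSON-style output of the same three numbers.

```python
import struct

# The 24 raw bytes from the question: three packed 8-byte doubles.
data = b'\\\x8f\xc2\xf5(\\\xf3?Nb\x10X9\xb4\x07@\x00\x00\x00\x00\x00\x00\xf0?'

# '<' pins little-endian (an assumption), '3d' reads three doubles in one call.
values = struct.unpack('<3d', data)
print(values)

# Hypothetical stand-in: the same numbers written out as JSON text.
# These bytes are ASCII characters, not packed doubles.
text = b'[1.21,2.963,1.0]'
print(len(data), len(text))
print(list(text[:5]))  # byte values of the characters '[', '1', '.', '2', '1'
```

Both objects are bytes, but the first holds the binary encoding of the numbers while the second holds the characters a person would type. numpy.frombuffer only makes sense for the first kind.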

There are (at least) 2 ways to get a numpy array out of the second, JSON-style set of bytes.

Short way

1. Convert to string

# Original JSON data (snipped)
junk = b'[-2.1997713216,-1.4249271187,-1.1076795391,...]'
# Decode from bytes to a string (defaults to utf-8), then
# trim off the brackets (first and last characters in the string)
as_str = junk.decode()[1:-1]

2. Use numpy.fromstring

numpy.fromstring(as_str, dtype=float, sep=',')
# Produces:
array([-2.19977132, -1.42492712, -1.10767954,  1.5224958 , -0.17097962,
        0.36638757,  0.14846441, -0.74159301, -1.76022319,  0.12660569,
        0.60109348, -0.46641536,  1.56755258,  1.00836295,  1.4332793 ,
        0.61133843, -1.80085406, -0.94434089,  1.09436704, -1.01146427,
        1.44389263, -0.27094273,  0.29904625,  0.46501336,  0.25607913,
        0.22576005, -2.40774298, -0.05099832,  1.00621871,  0.43150758,
        ... ])
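
Since the second byte string is a JSON array, another option (not part of the original answer) is to let the json module do the parsing instead of trimming brackets by hand. A minimal sketch, using a shortened stand-in for the real data:

```python
import json

# Shortened stand-in for the JSON bytes in the question.
raw = b'[-2.1997713216,-1.4249271187,-1.1076795391]'

# json.loads accepts bytes directly (Python 3.6+) and returns a list of floats.
values = json.loads(raw)
print(values)
```

The resulting list can then be handed to numpy.array(values, dtype=float) to get the same array as above.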

Long way

Note: I found the fromstring method after writing this part up, figured I'd leave it here to at least help explain the byte differences.

1. Convert the JSON data into an array of numeric values.

# Original JSON data (snipped)
junk = b'[-2.1997713216,-1.4249271187,-1.1076795391,...]'
# Decode from bytes to a string - defaults to utf-8
junk = junk.decode()
# Trim off the brackets - First and last characters in the string
junk = junk[1:-1]
# Separate into values
junk = junk.split(',')
# Convert to numerical values
doubles = [float(val) for val in junk]
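# Or, as a one-liner starting from the original bytes object
# (junk must still be bytes here, not the list produced by the steps above)
doubles = [float(val) for val in junk.decode()[1:-1].split(',')]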
# "doubles" currently holds:
[-2.1997713216,
 -1.4249271187,
 -1.1076795391,
 1.5224958034,
 ...]

2. Use struct to get byte-representations for the doubles

import struct
as_bytes = [struct.pack('d', val) for val in doubles]
# "as_bytes" currently holds:
[b'\x08\x9b\xe7\xb4!\x99\x01\xc0',
 b'\x0b\x00\xe0`\x80\xcc\xf6\xbf',
 b'+ ..\x0e\xb9\xf1\xbf',
 b'hg>\x8f$\\\xf8?',
 ...]

3. Join all the double values (as bytes) into a single byte-string, then submit to numpy

new_data = b''.join(as_bytes)
numpy.frombuffer(new_data)
# Produces:
array([-2.19977132, -1.42492712, -1.10767954,  1.5224958 , -0.17097962,
        0.36638757,  0.14846441, -0.74159301, -1.76022319,  0.12660569,
        0.60109348, -0.46641536,  1.56755258,  1.00836295,  1.4332793 ,
        0.61133843, -1.80085406, -0.94434089,  1.09436704, -1.01146427,
        1.44389263, -0.27094273,  0.29904625,  0.46501336,  0.25607913,
        0.22576005, -2.40774298, -0.05099832,  1.00621871,  0.43150758,
        ... ])
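
Steps 2 and 3 can also be collapsed into a single struct.pack call. This is a sketch under the assumption that little-endian output is wanted (the '<' prefix); the three values are a shortened stand-in for the full list.

```python
import struct

# Shortened stand-in for the parsed list of doubles.
doubles = [-2.1997713216, -1.4249271187, -1.1076795391]

# Build a format such as '<3d': little-endian, one 8-byte double per value.
fmt = f'<{len(doubles)}d'
packed = struct.pack(fmt, *doubles)
print(len(packed))  # 8 bytes per value

# Round trip to confirm the bytes decode back to the same numbers.
restored = list(struct.unpack(fmt, packed))
print(restored == doubles)
```

The packed result is the same kind of byte string as the first one in the question, so numpy.frombuffer can read it.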
