Understanding the binary transformation of strings and integers

  • #1
Arman777
For fun, I have decided to implement a simple XOR encryption algorithm. The first step is to convert the message into bytes so that the XOR operation can be applied to each bit. This is where my problem starts. For instance, I want to encrypt this message:

Code:
I hiked 24 miles.

Now I need to turn this text into binary. There seem to be two different ways to do it for '24':

a) Take 2 and 4 separately, as integers, which gives
Code:
>>> bin(2)
'0b10'
>>> bin(4)
'0b100'
or
[00000010, 00000100]

b) Take 2 and 4 as strings, by doing something like this

Code:
>>> bin(ord('2'))
'0b110010'
>>> bin(ord('4'))
'0b110100'
or
[00110010, 00110100]

Does this make a difference from the perspective of XOR encryption (or in general)? What is the correct approach?
 
  • #2
Arman777 said:
Does this make a difference from the perspective of XOR encryption (or in general)? What is the correct approach?
'I hiked 24 miles.' is a string of characters, that is, a sequence of bytes. The encryption routine cannot know what format a numeric field inside the text should have, so it must encrypt the character string as it stands, regardless of any numbers it contains.
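
A minimal sketch of this approach, not a definitive implementation. It assumes UTF-8 as the text encoding and a repeating key; the helper name `xor_bytes` and the key value are hypothetical, since the thread does not specify either:

```python
from itertools import cycle


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each data byte with the key, repeating the key as needed."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(d ^ k for d, k in zip(data, cycle(key)))


message = "I hiked 24 miles."
# Every character, digits included, becomes its character code:
# '2' -> 0x32 and '4' -> 0x34.
plaintext = message.encode("utf-8")
key = b"secret"  # hypothetical key, for illustration only

ciphertext = xor_bytes(plaintext, key)
# XOR is its own inverse, so applying the same key again decrypts.
recovered = xor_bytes(ciphertext, key).decode("utf-8")
```

Because the digits are treated like any other character, `recovered` is identical to `message`.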
 
  • #3
Arman777 said:
For fun, I have decided to implement a simple XOR encryption algorithm. The first step is to convert the message into bytes so that the XOR operation can be applied to each bit. This is where my problem starts. For instance, I want to encrypt this message:

Code:
I hiked 24 miles.

Now I need to turn this text into binary. There seem to be two different ways to do it for '24':

a) Take 2 and 4 separately, as integers, which gives
Code:
>>> bin(2)
'0b10'
>>> bin(4)
'0b100'
or
[00000010, 00000100]

b) Take 2 and 4 as strings.

Does this make a difference from the perspective of XOR encryption (or in general)? What is the correct approach?

Note that 0x04 is the ASCII "end of transmission" (EOT) character, which a UNIX terminal treats as end of input when it is typed as Ctrl-D, and 0x02 is the "start of text" (STX) character, which has no special meaning to UNIX. Stored in a file, both are ordinary bytes, so the decrypted data would be b'I hiked \x02\x04 miles.', which is not the original message: the digits '2' and '4' can no longer be told apart from genuine 0x02 and 0x04 bytes. In fact, all bytes in the range 0x00 to 0x1F are control characters, which may have a special meaning for whatever application is going to interpret your decrypted data, if not for the terminal or operating system itself. So '2' -> 0x02 etc. is a very bad idea.
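
To make the problem concrete, here is a small sketch comparing the two conversions from the question. The variable names are illustrative, and the message is assumed to be plain ASCII:

```python
message = "I hiked 24 miles."
DIGITS = "0123456789"

# Option (a): digit characters become their numeric values,
# all other characters become their character codes.
option_a = bytes(int(ch) if ch in DIGITS else ord(ch) for ch in message)

# Option (b): every character, digits included, becomes its character code.
option_b = message.encode("ascii")

# Bytes below 0x20 are ASCII control characters.
control_bytes = [b for b in option_a if b < 0x20]

# Decoding option (a) does not give the original text back.
decoded_a = option_a.decode("ascii")
decoded_b = option_b.decode("ascii")
```

Here `control_bytes` holds 2 and 4, `decoded_a` differs from `message`, and `decoded_b` equals it.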

@Baluncore's approach is the only viable approach, and ensures that your decrypted data will be byte for byte identical with the input.
 
  • #4
Then I will do that. Thanks for the answers.
 
  • #5
What seems to be missing here is an understanding of the difference between numeric digit characters, such as '2' or '4', and the numbers 2 and 4.

The characters '2' and '4' are stored as their ASCII codes, 50 and 52 (0x32 and 0x34), respectively. The numbers 2 and 4 are stored as their own values.
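
In Python, the distinction can be seen with the built-ins `ord`, `int`, and `chr` (a short illustration; the variable names are arbitrary):

```python
digit_char = "2"

code = ord(digit_char)   # character code of '2'
value = int(digit_char)  # the number that the character denotes

print(code, bin(code))    # 50 0b110010
print(value, bin(value))  # 2 0b10

# chr() goes from a character code back to the character.
print(chr(52))            # 4
```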
 
  • #6
When a number is embedded in text, its leading zeros can usually be dropped without changing its value, but imagine trying to correctly restore data that had significant leading or trailing zeros in a part number.

An encryption/decryption algorithm should always regenerate the input exactly. The message should never be arbitrarily abbreviated or compressed.
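
A brief sketch of the part-number case. The part number and the single-byte key are made up for illustration:

```python
part_number = "00240"  # hypothetical part number with significant zeros

# Treating the field as a number loses the leading zeros.
as_number = int(part_number)
lossy = str(as_number)

# Treating it as bytes keeps every character.
key = 0x5A  # hypothetical single-byte key
encrypted = bytes(b ^ key for b in part_number.encode("ascii"))
restored = bytes(b ^ key for b in encrypted).decode("ascii")
```

After this runs, `lossy` is "240" while `restored` is still "00240".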
 

FAQ: Understanding the binary transformation of strings and integers

What is the difference between a string and an integer?

A string is a sequence of characters, such as letters, digits, and symbols, used to represent text. An integer is a numerical value used for counting or calculation. The main difference is that a string stores the characters themselves (the string "24" is the two characters '2' and '4'), while an integer stores the value (the number 24).

How are strings and integers stored in binary code?

In memory, both strings and integers are represented as sequences of 0s and 1s. Each character in a string is mapped to a numeric code by a character encoding such as ASCII or UTF-8, while an integer is typically stored as a binary number with a fixed number of bits. For example, in ASCII the string "hello" is stored as 01101000 01100101 01101100 01101100 01101111, and the integer 6 stored in one byte is 00000110.
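
These bit patterns can be reproduced with a few lines of Python, assuming ASCII encoding and an 8-bit width for display:

```python
text = "hello"

# Encode the string to bytes, then format each byte as 8 binary digits.
text_bits = " ".join(format(b, "08b") for b in text.encode("ascii"))

# Format the integer 6 as a single 8-bit value.
six_bits = format(6, "08b")

print(text_bits)  # 01101000 01100101 01101100 01101100 01101111
print(six_bits)   # 00000110
```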

Why is it important to understand the binary transformation of strings and integers?

Understanding the binary transformation of strings and integers is important because it is the fundamental way in which computers store and process data. Having a solid understanding of how strings and integers are represented in binary code can help with troubleshooting and optimizing code, as well as developing more efficient algorithms.

Can a string be converted into an integer?

Yes, a string can be converted into an integer with a function or method that parses it. The conversion only succeeds if the string has the form of a number. If it contains other characters, the result depends on the language: an error may be raised, or an unexpected value may be returned.
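
In Python, for instance, the built-in `int` raises `ValueError` when the string is not a valid integer. The wrapper `to_int` below is a hypothetical helper that returns `None` instead:

```python
def to_int(text):
    """Return the integer value of text, or None if it is not a valid integer."""
    try:
        return int(text)
    except ValueError:
        return None


good = to_int("24")
bad = to_int("24 miles")

print(good)  # 24
print(bad)   # None
```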

How does the binary transformation of strings and integers impact computer performance?

Converting between strings and integers takes processing time and memory, because text has to be parsed into a number or a number formatted as text. In code that handles large amounts of data, repeated or unnecessary conversions can add measurable overhead, so it helps to convert once and keep the data in the form in which it will be used.
