Decompiling Program: How to Obtain Source Code?

  • Thread starter liokaiser
  • Tags
    Program
In summary, it is not possible to obtain the exact same source code that the original authors wrote from an executable file. Variable names and other information are generally lost during compilation. The only way to potentially view the source code is to use a decompiler, which can generate one of the possible source codes that could have been compiled to the given code. However, this is often against the law and there are no guarantees that the decompiled code will be accurate. Reverse engineering can be used for this purpose, but there are also limitations and potential legal issues involved.
  • #1
liokaiser
Does anyone know how to obtain the source code from the executable file itself?
 
  • #2
It's not possible to obtain the exact same source code that the original authors wrote. Variable names, for example, are generally lost during the compilation. All you can do is use a decompiler to generate one of the possible source codes that could have been compiled to the given code.

- Warren
 
  • #3
If you are working on a BSD-based system (macOS, for example) and want to inspect or disassemble Mach-O files, just use otool. Note that it gives you assembly, not source code.
 
  • #4
I once used a program called DeCafe to decompile a Java .class file.
 
  • #5
What programming language are you using?
Why do you want to see the source code?
What is the main purpose of reverse engineering in software engineering?
Do you know how to use polynomial code or more figurable to -polynomialize- your code in the software you make?
How many software products have you written that are used with cryptography?

Can you clear up those questions so that I can try to give you some answers?
 
  • #6
chroot said:
It's not possible to obtain the exact same source code that the original authors wrote. Variable names, for example, are generally lost during the compilation. All you can do is use a decompiler to generate one of the possible source codes that could have been compiled to the given code.

- Warren
That's not entirely true. There are "underground" programs that can decompile programs to the exact source code (I know of 2 that work for sure). However, these programs are illegal and cannot be obtained. Also, you need to remember that fully decompiling software is against the law. There was a Supreme Court ruling about this issue a while back. So your only hope of seeing the unfinished source code of a program is by using reverse engineering.

As a little disclaimer, so I don't get in trouble, I do not support or assist in any sort of illegal activity.
 
  • #7
Cod,

Please name some of these mythical underground programs. When you compile a program in C, for example, you lose comments, formatting, and often variable and constant names. There is no way to recreate this lost information, even in principle.

- Warren
 
  • #8
chroot said:
Cod,

Please name some of these mythical underground programs. When you compile a program in C, for example, you lose comments, formatting, and often variable and constant names. There is no way to recreate this lost information, even in principle.

- Warren
I'm not going to name any here for obvious reasons. Contact me on AIM one day and I'll link you to a few sites with the programs. And yes, you do lose comments and the format, but you never lose variable and constant names. Sometimes, however, they are "hidden" and harder to come across.
 
  • #9
Cod,

Just for the record -- it's not illegal to talk about or perhaps even to distribute any kind of software. It's also not illegal to use any kind of reverse-engineering tools on software that is not legally protected. If I write my own "Hello World" and use a reverse-engineering tool, I'm not breaking any laws.

It's like a gun -- guns are not illegal, but many of the things one can do with a gun are.

- Warren
 
  • #10
Heh, you are correct. Maybe I didn't phrase my last post so it reflected my thoughts. So I'll retry this...

There are malicious individuals browsing this site daily (I'm sure of it), so I don't want the name of a program to get into the hands of a "script kiddie". Once they get the name, they'll go get the program and start reverse engineering software that is protected from it (e.g., Windows is a popular choice of malicious hackers).

It's nothing against you, but I'm not going to be the reason for someone's illegal doings. It's a moral thing :smile:
 
  • #11
Cod said:
I'm not going to name any here for obvious reasons. Contact me on AIM one day and I'll link you to a few sites with the programs. And yes, you do lose comments and the format, but you never lose variable and constant names. Sometimes, however, they are "hidden" and harder to come across.

This is incorrect, however. If you have a program that has been compiled into machine code, it is impossible to recover the names of anything. I can take a program written in C, rename every single variable and constant, and the compiler will produce the exact same binary file.
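
A small experiment makes the renaming argument concrete. The sketch below uses Python bytecode rather than C machine code, purely because it is easy to run anywhere, so treat it as an analogy and not as a statement about any particular C compiler. The two sources and the helper `instruction_bytes` are made up for illustration. The definitions differ only in identifier names, comments, and layout, and the compiled instruction bytes should come out identical.

```python
# Two sources that differ only in local variable names, comments and layout.
SRC_A = '''
def add(price, tax):
    # total cost including tax
    total = price + tax
    return total
'''

SRC_B = '''
def add(a, b):
    c = (a
         + b)
    return c
'''


def instruction_bytes(source):
    """Compile `source` and return the raw instruction bytes of its `add` function."""
    namespace = {}
    exec(compile(source, "<example>", "exec"), namespace)
    # co_code holds only the instructions; local variables are referred to
    # by slot index, not by name.
    return namespace["add"].__code__.co_code


same = instruction_bytes(SRC_A) == instruction_bytes(SRC_B)
print(same)
```

Since both sources map to the same instruction stream, nothing in that stream can tell a decompiler which of the two the author actually wrote.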

Not only that, but you can't even recover the original code without the variable names. Different statements can end up generating the same machine code, making it impossible to determine what the program code was just using the machine code. And optimizing compilers mess it up even more; they may delete parts of the code, rearrange the ordering, inline function calls, etc.

You may be able to decompile a program if it is not compiled into machine code, but instead some kind of byte code that has additional information embedded into it. Or if debugging info has been embedded into the program. But if the program is compiled into straight machine code, then about the best you can hope for is to disassemble it into assembly language (and even that isn't guaranteed to work); and those kinds of tools are hardly secret hacker tools that are hard to find.
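
The byte code case is easy to see in Python, where the compiled code object carries its identifiers as metadata alongside the instructions. This is a minimal sketch with a made-up function, `monthly_interest`; the attributes `co_varnames` and `co_names` and the `dis` module are part of standard CPython.

```python
import dis


def monthly_interest(balance, annual_rate):
    periods = 12
    return round(balance * annual_rate / periods, 2)


code = monthly_interest.__code__

# Argument and local variable names are stored in the compiled object...
local_names = code.co_varnames
# ...and so are the names of globals the function refers to.
referenced_names = code.co_names

print(local_names)
print(referenced_names)

# The disassembly listing can therefore show the original identifiers.
dis.dis(monthly_interest)
```

A native executable built without debugging info and then stripped keeps no such table for its local variables, which is why the same trick does not carry over to it.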
 
  • #12
However, I should mention that given a program compiled from C, it is possible to recover some of the code as C statements. It isn't possible to recover variable names, or function names, or preprocessor statements, or a lot of other things.

For example, you can usually decompile a program into functions, since most compilers use similar code as prologues and epilogues for their functions. And a lot of arithmetic statements can be recovered. Even conditionals and loops may be recoverable. This is particularly true for code which has not been optimized; optimizing screws things up, since the code produced by the compiler tends to no longer stick to a certain style for each statement.
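
To give a feel for how arithmetic statements can be recovered, here is a toy sketch. It assumes a hypothetical stack machine with a handful of opcodes (`load`, `const`, `add`, `sub`, `mul`, `store`) invented for this example; real decompilers work on real instruction sets and are far more involved. The idea is to simulate the stack symbolically and emit one assignment per `store`, inventing names such as `v0` because the storage slots carry no names.

```python
# Hypothetical stack-machine program: a list of (opcode, operand) pairs.
# Slots are numbered storage locations; no identifiers survive compilation.
PROGRAM = [
    ("load", 0), ("load", 1), ("add", None), ("store", 2),
    ("load", 2), ("const", 3), ("mul", None), ("store", 0),
]

OPERATORS = {"add": "+", "sub": "-", "mul": "*"}


def decompile(program):
    """Rebuild one assignment per `store` by symbolically executing the stack."""
    stack, statements = [], []
    for opcode, operand in program:
        if opcode == "load":
            stack.append(f"v{operand}")  # invented name for slot N
        elif opcode == "const":
            stack.append(str(operand))
        elif opcode in OPERATORS:
            right, left = stack.pop(), stack.pop()
            stack.append(f"({left} {OPERATORS[opcode]} {right})")
        elif opcode == "store":
            statements.append(f"v{operand} = {stack.pop()};")
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return statements


recovered = decompile(PROGRAM)
print("\n".join(recovered))
```

The output is two assignments over `v0`, `v1` and `v2`. The original might have used a named temporary, or a single expression that the compiler split in two; the instruction list alone cannot distinguish those cases.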

Of course, this is a lot of work. A decompiler has to be tuned for every compiler, and even for different versions of the same compiler. And you still have your work cut out for you when it comes to actually figuring out what the code was doing; particularly for programs that are more than just a couple functions. And it gets more difficult as you move to compiled languages other than C. C is about as low-level as you get, so there's less of a difference in semantics between the code generated and the original C code.

If you want more information, go to google and search with "decompiling c programs". That should get you quite a bit of info. Reverse engineering is a very active area of study, but have reasonable expectations of what can be done. Some things are easy. Lots of things are very hard. And lots of things are impossible.
 
  • #13
master_coda,

I was just waiting to hear about these "elite hacker decompilers" that are capable of decompiling Windows into source before making those statements. :smile:

- Warren
 
  • #14
chroot said:
master_coda,

I was just waiting to hear about these "elite hacker decompilers" that are capable of decompiling Windows into source before making those statements. :smile:

- Warren

But these discussions distract me from writing my program that automatically finds and fixes all the bugs in a program. :biggrin:
 
  • #15
Originally posted by master_coda:
But these discussions distract me from writing my program that automatically finds and fixes all the bugs in a program.
Too ambitious. I'll be happy when I finish the one that just identifies infinite loops. :-p
 
  • #16
gnome,

Look into the so-called "Cleanroom Software Engineering" techniques. A good book is Stavely's "Toward Zero Defect Programming." If you learn to use consistent and verifiable structures in your code, you'll be much less likely to introduce bugs of that kind.

- Warren
 
  • #17
chroot said:
gnome,

Look into the so-called "Cleanroom Software Engineering" techniques. A good book is Stavely's "Toward Zero Defect Programming." If you learn to use consistent and verifiable structures in your code, you'll be much less likely to introduce bugs of that kind.

- Warren

I was under the impression that his remark was a tongue-in-cheek reference to solving the halting problem, not an actual complaint that he was having problems with infinite loops in his code...
 
  • #18
I'm glad at least one person got it. :rolleyes:
 
  • #19
gnome said:
Too ambitious. I'll be happy when I finish the one that just identifies infinite loops. :-p

No problem. Say your computer can access at most N bits of storage. Then it can have at most [tex]2^N[/tex] distinct states. You just run the program for [tex]2^N[/tex] steps. :zzz: If it hasn't finished by then, it's stuck in an infinite loop.
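
For a machine small enough to simulate, the counting argument can be sketched directly. This assumes a deterministic machine with no outside input whose entire state fits in `n_bits` bits; the function `halts_within_state_bound` and the two example machines are made up for illustration.

```python
def halts_within_state_bound(step, state, n_bits):
    """Decide halting for a deterministic machine with at most 2**n_bits states.

    `step(state)` returns the next state, or None when the program halts.
    A halting run passes through each state at most once, so it needs at most
    2**n_bits calls to `step`. Anything longer has revisited a state and
    will repeat forever.
    """
    for _ in range(2 ** n_bits):
        state = step(state)
        if state is None:
            return True
    return False


def countdown(s):
    # Halts once the 3-bit counter reaches zero.
    return None if s == 0 else s - 1


def spin(s):
    # Cycles through the odd 3-bit values and never halts.
    return (s + 2) % 8


print(halts_within_state_bound(countdown, 7, 3))
print(halts_within_state_bound(spin, 1, 3))
```

The loop bound doubles with every extra bit of state, so this only works for toy machines.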
 
  • #20
chronon said:
No problem. Say your computer can access at most N bits of storage. Then it can have at most [tex]2^N[/tex] distinct states. You just run the program for [tex]2^N[/tex] steps. :zzz: If it hasn't finished by then, it's stuck in an infinite loop.

The amount of internal storage space is fixed. But a computer's state isn't determined just by its internal storage, since a computer can access the outside world.

Not only that, but your solution wouldn't work anyway. We don't have enough matter in the universe to construct enough memory to run your program.
 
  • #21
Sooo, where are the *relations* of your posts to the OP ?
 
  • #22
Concord said:
Sooo, where are the *relations* of your posts to the OP ?

My posts are all responses to other posts. If you follow the chain of responses up far enough, you reach the opening post.
 
  • #23
originally posted by Concord:
Sooo, where are the *relations* of your posts to the OP ?
Is this some sort of secret code?
 
  • #24
gnome said:
Is this some sort of secret code?

I think he was trying to not-so-subtly hint that we were off topic and should stop talking.
 

FAQ: Decompiling Program: How to Obtain Source Code?

What is decompiling and why is it used?

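Decompiling is the process of converting a compiled program back into source code in a higher-level language. The result approximates the original source rather than reproducing it, since names, comments, and formatting are generally lost during compilation. It is often used for debugging, understanding how a program works, or making modifications to the code.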

Is it legal to decompile a program?

The legality of decompiling a program varies by country and jurisdiction. In some cases, it may be considered fair use for personal use, but it is always best to consult a legal professional before decompiling any program.

What tools are needed for decompiling a program?

There are various decompilers available, such as IDA Pro, Ghidra, and Jadx, that can be used for decompiling programs. These tools often require some technical knowledge and may not always be able to produce a perfect representation of the original source code.

What are the limitations of decompiling a program?

Decompiling a program may not always result in a perfect representation of the original source code. Some code may be missing or may be difficult to understand. Additionally, decompiled code may not always be executable and may require additional modifications to work properly.

Can any program be decompiled?

Not all programs can be decompiled. Some programs may have obfuscated or encrypted code that makes it difficult or impossible to decompile. Additionally, some programs may have been intentionally designed to prevent decompiling for security reasons.
