How to lookup symbols at runtime?

In summary, the conversation discusses the use of libraries for symbol lookup in Linux, specifically for C++ name mangling. The concept of automation and COM is mentioned as a solution to the problem, as well as the use of HTML for GUI design and symbol lookup in C/C++. The conversation also touches on the idea of creating a management system and class factory for easily extendable and secure use of 3rd party libraries.
  • #1
TylerH
Is there a library with which I can look up the value of a given symbol in Linux?

Also, I've read that C++ name mangling is both unstandardized and difficult. Are there libraries for GCC or Clang?

I want to use these to build a GUI that uses HTML to design the interface and symbol lookup to allow it to interface with native code. Probably C++.
 
  • #2
Hey TylerH.

When you say symbol lookup what kind of symbol are you talking about? I looked at the name mangling and I presume you are referring to some kind of function or variable with type information.

Microsoft solved this problem in the Component Object Model. Essentially the heart of COM is that everything is bound to an interface.

You query an interface to get the pointer to the actual function and by using the interface pointer as a gateway, it means you can store all the meta-data associated with the actual function/class/whatever.

COM had a standard interface for getting all the information at runtime and it's known as automation.

What a lot of developers do when they have heavy metadata requirements, or want to interface scripting languages or multiple languages together, is create the metadata required to access all the information at run-time, in a format that is easy to decipher and use in the context of the application.

There are many ways to do this and if you have not considered this in your design specs and have already written a tonne of code, I'm afraid that implementing this as a general feature (other than getting the compiler or another tool to do it for you) is not going to be fun.

I would check out the concept of Automation if you want to do this: this was designed for the exact same problems that you are having now.

Also be aware that this has developed quite a bit since the OLE and COM days and is a bit more mature now in the design, development, and implementation.

You don't have to use existing frameworks out there, but the ideas are going to be essential.
 
  • #3
That sounds like what I'm talking about. I'm assuming that every symbol is stored in some kind of map (i.e. a data structure that has a key and a value), with the name (a string) being the key and the value being a reference/pointer to the actual place the symbol is stored.

I won't need type info, just what is necessary for dynamic linking (i.e. symbol/reference pairs). The way I am thinking of implementing it is just to use the standard HTML attributes (like "onclick") to specify a C/C++ function to call for user interactions. In short, I will be using C/C++ as a drop-in replacement for what would be done in JavaScript. But the purpose would be to allow HTML to be used for a GUI rather than using C/C++ with a browser.

What I don't know is how to do that symbol lookup. Like, if I need the C function "button_onclick()" in Linux/ELF, what do I call to get a reference to the function? I realize you basically gave an overview of how that works, but I wasn't able to find any APIs using the search terms "automation API linux."
 
  • #4
Automation may be a Windows-only term (it was used when describing the above features for OLE and COM and later frameworks, which were Windows only).

What I recommend you do is construct a management system that has a registry of all classes and then uses that to do class, function, and data resolution.

Here is a brief outline:

First you create different class types, each with its own purpose. They all derive from the base class: each class must return meta-data describing the object type (string and numeric ID) as well as any other simple meta-data you wish to have.

You then provide an interface (which is just a special abstract structure with a pointer to said structure) and then you define specific structures that have a v-table which is just a table of pointers in one column and a meta-data structure in the other column.

The resolution basically resolves the pointer from the meta-data structure.

The quickest way is to use a hash table that is big enough with a good algorithm to minimize collisions and make bin allocation as uniform as possible.

Alternatively you can just loop through all entries until you find what you want.
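
As a rough sketch of the hash-table approach, here is one way a name-to-function registry could look in C++. The class `HandlerRegistry`, its methods `add` and `resolve`, and the counter `click_count` are made-up names for illustration; `std::unordered_map` is the standard library's hash table.

```cpp
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical registry: maps a handler name (for example the value of an
// HTML "onclick" attribute) to a callable.
class HandlerRegistry {
public:
    using Handler = std::function<void()>;

    // Returns false if the name is already taken, so an existing handler
    // is never silently replaced.
    bool add(const std::string& name, Handler handler) {
        return handlers_.emplace(name, std::move(handler)).second;
    }

    // Hash-table lookup; returns nullptr when the name is unknown.
    const Handler* resolve(const std::string& name) const {
        auto it = handlers_.find(name);
        return it == handlers_.end() ? nullptr : &it->second;
    }

private:
    std::unordered_map<std::string, Handler> handlers_;
};

// Illustrative handler and counter, not part of any real API.
int click_count = 0;
void button_onclick() { ++click_count; }
```

After `registry.add("button_onclick", button_onclick)`, the GUI layer would call `registry.resolve("button_onclick")` with the string taken from the HTML attribute and invoke the result if it is not null. The linear-search alternative would only change the container, not the interface.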

With those done, you now want to think of how you can create a framework that is easily extendable, either through internal addition of functionality or through 3rd party libraries (like DLLs on Windows or shared objects, i.e. .so files, on Linux).

To do this you need to consider what is called a class factory. The class factory is given enough information about a class that it can create an instance, either from a new function (if it's internal) or through a 3rd party interface (if it's in a 3rd party library).

The factory manages all objects that have been allocated and you need to include methods to allocate and de-allocate class instances, resolve pointers with named strings, have hierarchical functionality, dependency handling (i.e. if two objects are dependent) and also some kind of event system (like a publish subscribe) that allows external parts of the software to get messages and updates on what is going on.
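
To make the class factory idea a bit more concrete, here is a minimal sketch that covers only the creation part (no dependency handling or events). `Widget`, `Button`, `WidgetFactory`, `register_type`, `create` and `make_button` are all hypothetical names chosen for this example.

```cpp
#include <functional>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

// Base class: every registered type reports a string name as its meta-data.
struct Widget {
    virtual ~Widget() = default;
    virtual std::string type_name() const = 0;
};

struct Button : Widget {
    std::string type_name() const override { return "Button"; }
};

// Creator for an internal class. For a 3rd party library, the creator could
// instead wrap a function pointer obtained from that library.
std::unique_ptr<Widget> make_button() {
    return std::unique_ptr<Widget>(new Button);
}

// Hypothetical factory: stores one creator function per type name.
class WidgetFactory {
public:
    using Creator = std::function<std::unique_ptr<Widget>()>;

    // Returns false if a creator is already registered under this name.
    bool register_type(const std::string& name, Creator creator) {
        return creators_.emplace(name, std::move(creator)).second;
    }

    // Returns an empty pointer when the type is unknown.
    std::unique_ptr<Widget> create(const std::string& name) const {
        auto it = creators_.find(name);
        if (it == creators_.end()) return nullptr;
        return it->second();
    }

private:
    std::unordered_map<std::string, Creator> creators_;
};
```

Returning `std::unique_ptr` is a design choice made here so that ownership is explicit; a factory that tracks every live object, as described above, would more likely keep the objects itself and hand out plain pointers or IDs.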

So when you do this right, you set up the system so that it is easily extendable, either by 3rd party libraries or by adding internal classes.

This is good when for example you have a custom 3rd party library that implements all your functions.

It's also good for security: you can add a simple trusted/banned check at any step of the factory or execution process, so that banned classes, or classes the system doesn't know about, cannot be used (in the same way a firewall asks you about a port or an application that it doesn't know about).

It's not only good for security: if you have a 3rd party library that crashes all the time, you can just block it so that it doesn't freeze the rest of the execution going on.
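
A trusted/banned check of this kind can be very small. The sketch below assumes a deny-by-default rule (anything not explicitly trusted is refused); `TrustPolicy` and its methods are hypothetical names, and the names passed in could be class names or library names.

```cpp
#include <set>
#include <string>

// Hypothetical policy object consulted before a class is created or a
// library is loaded.
class TrustPolicy {
public:
    void trust(const std::string& name) { trusted_.insert(name); }
    void ban(const std::string& name) { banned_.insert(name); }

    // Allowed only if explicitly trusted and not banned. Unknown names are
    // rejected, and a ban always wins over an earlier trust() call.
    bool is_allowed(const std::string& name) const {
        return banned_.count(name) == 0 && trusted_.count(name) == 1;
    }

private:
    std::set<std::string> trusted_;
    std::set<std::string> banned_;
};
```

The factory would call `is_allowed()` before creating an instance and refuse the request when it returns false. Prompting the user about unknown names, as a firewall does, would sit on top of this check.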

So for resolving your on_click method in a custom internal class or 3rd party library, you just resort to the v_table to get the actual pointer.

The only thing left is known as marshalling, which is a way to prepare data in one environment for use and execution in another (again, this is a COM term). It simply means taking the definition in one environment and converting it to a format that the new function and environment can use.

So you will need to have a standard way to marshal data. For marshalling, what I recommend is that each data-type has its own class (with its own factory and registration) and that you put a simple export and import method in each class that returns a representation of that object in a different format.

I don't recommend you return an actual data structure, but rather a text definition of the object that the class factory uses to create the structure.

The class factory needs this standard function to create a new object from a text definition of the object (like an HTML tag based definition) for some type. It returns the object pointer for the proper type if it was successful and a NULL pointer if not.
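
Here is a small sketch of that import/export idea. To keep it short it uses a made-up "type:payload" text format (for example "int:42") instead of an HTML-like tag format; `Value`, `IntValue`, `TextValue`, `export_text` and `import_text` are hypothetical names.

```cpp
#include <cstddef>
#include <exception>
#include <memory>
#include <string>
#include <utility>

// Each data type is its own class and can export itself as text.
struct Value {
    virtual ~Value() = default;
    virtual std::string export_text() const = 0;
};

struct IntValue : Value {
    int value;
    explicit IntValue(int v) : value(v) {}
    std::string export_text() const override { return "int:" + std::to_string(value); }
};

struct TextValue : Value {
    std::string value;
    explicit TextValue(std::string v) : value(std::move(v)) {}
    std::string export_text() const override { return "text:" + value; }
};

// Creates an object from its text definition; returns a null pointer when the
// definition is malformed or names an unknown type.
std::unique_ptr<Value> import_text(const std::string& definition) {
    const std::size_t colon = definition.find(':');
    if (colon == std::string::npos) return nullptr;

    const std::string type = definition.substr(0, colon);
    const std::string payload = definition.substr(colon + 1);

    if (type == "text") return std::unique_ptr<Value>(new TextValue(payload));
    if (type == "int") {
        try {
            std::size_t used = 0;
            const int parsed = std::stoi(payload, &used);
            if (used != payload.size()) return nullptr;  // trailing characters
            return std::unique_ptr<Value>(new IntValue(parsed));
        } catch (const std::exception&) {
            return nullptr;  // not a number, or out of range for int
        }
    }
    return nullptr;  // unknown type
}
```

Because every value round-trips through text, the side that receives the data never needs to know the in-memory layout used by the side that produced it.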

So that's my suggestion: it will require a little bit of time to construct the foundations but the main thing is that it's extensible and doesn't have to be modified much once you create functionality (especially with regards to extensibility) other than if you have other design considerations to make.
 
  • #5


On Linux, several libraries help with symbol lookup. libdl provides dlopen() and dlsym(), which load a shared library at runtime and return the address of a named symbol. libelf and libbfd work at a different level: they read the symbol tables of object files on disk, which is useful for inspecting a binary but does not by itself load it or let you call into it.
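
For the "what do I call to get a reference to the function" part of the question, a minimal sketch using dlopen() and dlsym() might look like this. The helper name `lookup_symbol` is made up for the example; the dl* calls themselves are the POSIX API declared in `<dlfcn.h>`.

```cpp
#include <dlfcn.h>   // POSIX: dlopen, dlsym, dlerror
#include <iostream>

// Hypothetical helper: returns the address of `symbol` in `library`, or
// nullptr on failure. Passing nullptr as `library` searches the running
// program and the libraries it has already loaded.
// The handle is deliberately never closed, so the returned address stays valid.
void* lookup_symbol(const char* library, const char* symbol) {
    void* handle = dlopen(library, RTLD_LAZY);
    if (handle == nullptr) {
        std::cerr << "dlopen failed: " << dlerror() << '\n';
        return nullptr;
    }

    dlerror();  // clear any earlier error before calling dlsym
    void* address = dlsym(handle, symbol);
    if (const char* error = dlerror()) {
        std::cerr << "dlsym failed: " << error << '\n';
        return nullptr;
    }
    return address;
}
```

For example, `lookup_symbol("libm.so.6", "cos")` should return the address of the C math library's cos() on a typical glibc system (the file name is an assumption about the target machine), and the result can be cast to `double (*)(double)` and called. For your own handlers such as button_onclick(), declaring them `extern "C"` keeps the symbol name unmangled, and handlers that live in the main executable are only visible to dlsym() if the program is linked with `-rdynamic`. Older glibc versions also need `-ldl` on the link line.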

Regarding C++ name mangling: the scheme is not part of the C++ standard, but on Linux both GCC and Clang follow the Itanium C++ ABI, so they mangle names the same way. To turn mangled names back into readable ones, you can call abi::__cxa_demangle() from <cxxabi.h>, or use the command-line tools c++filt (GNU binutils) and llvm-cxxfilt (LLVM).
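
As an illustration, here is a small wrapper around abi::__cxa_demangle(), which is available with GCC and Clang on Linux. The wrapper name `demangle` is made up for the example, and falling back to the input string on failure is a choice made here, not something the API requires.

```cpp
#include <cxxabi.h>  // abi::__cxa_demangle (GCC and Clang on Itanium-ABI platforms)
#include <cstdlib>   // std::free
#include <string>

// Hypothetical wrapper: returns the readable form of an Itanium-ABI mangled
// name, or the input unchanged if it cannot be demangled.
std::string demangle(const char* mangled) {
    int status = 0;
    // With a null output buffer, the function allocates the result with
    // malloc, so the caller must release it with free.
    char* readable = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || readable == nullptr) {
        return mangled;
    }
    std::string result(readable);
    std::free(readable);
    return result;
}
```

For instance, `demangle("_Z3foov")` yields `foo()`. Going the other way (producing the mangled name from a declaration) is harder, which is one reason to export GUI handlers as `extern "C"` functions and look them up by their plain names.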

It is possible to use these libraries to build a GUI that utilizes HTML for the interface and symbol lookup for native code, particularly in C++. However, it is important to carefully consider the compatibility and stability of these libraries when implementing them in your project. Additionally, it may be beneficial to consult with experienced programmers or software engineers for guidance and support in utilizing these libraries effectively.
 

FAQ: How to lookup symbols at runtime?

What is the purpose of looking up symbols at runtime?

Looking up symbols at runtime allows a program to access and use functions or variables that are not known until the program is running. This allows for more flexible and dynamic programming.

How do you lookup symbols at runtime?

The specific method for looking up symbols at runtime may vary depending on the programming language and environment. However, in general, it involves using a function or library that allows for dynamic symbol resolution, such as dlsym() in C or reflection in Java.

What is the difference between static and dynamic symbol lookup?

Static symbol lookup is done when the program is built: the linker matches each symbol reference to its definition. Dynamic symbol lookup is done at runtime, either automatically by the dynamic loader when a shared library is loaded or explicitly by the program, for example through dlsym().

Can symbols be looked up from external libraries?

Yes, symbols from external libraries can be looked up at runtime using dynamic symbol resolution. This allows for programs to use functions or variables from external libraries without having them explicitly linked at compile time.

What are the potential benefits of using runtime symbol lookup?

Runtime symbol lookup allows for more flexibility and adaptability in programming, as symbols are resolved only when they are needed. This enables plugin architectures and optional features: a program can load a library only if it is present, and can adapt to changes in functionality without being rebuilt.
