Complexity for generating strings of length k from CNF grammar

  • #1
CGandC
Homework Statement
For the algorithm below, explain whether the time and space complexities are Linear, Polynomial or Exponential ( find a tight upper-bound ) in ##k##. Assume the size of the rules set is constant.
Relevant Equations
Big-Oh notation, context-free languages
Here's the code I've written for generating all strings of length ##k## from a given CNF ( Chomsky Normal Form ) grammar. The algorithm is a recursive variant of the CYK algorithm and uses memoization.

Python:
def generate_language(rule_dict, start_var, k):
    mem = dict()
    return generate_language_rec(rule_dict, start_var, k, mem)

def generate_language_rec(rule_dict, var, k, mem):
    if (var, k) in mem:
        return mem[(var, k)]

    s = set()
    if k == 0:
        s.add("")
        mem[(var, k)] = s
        return s

    if k == 1:
        for x in rule_dict[var]:
            if len(x) == 1:
                s.add(x)  # x is a single terminal symbol such as "a"
        mem[(var, k)] = s
        return s

    for var_rule in rule_dict[var]:
        if len(var_rule) == 2:
            X, Y = var_rule[0], var_rule[1]
            for j in range(1, k):
                s1 = generate_language_rec(rule_dict, X, j, mem)
                s2 = generate_language_rec(rule_dict, Y, k - j, mem)
                # concatenate the strings of s1 and s2 in both orders, i.e. strings
                # from s1 followed by strings from s2, and vice versa
                s.update({ str1 + str2 for str1 in s1 for str2 in s2 })
                s.update({ str2 + str1 for str1 in s1 for str2 in s2 })
    mem[(var, k)] = s
    return s

The algorithm works fine; here's an input example:
Python:
rule_dict = {"S": {"AB", "BC"}, "A": {"BA", "a"}, "B": {"CC", "b"}, "C": {"AB", "a"}}
res = generate_language(rule_dict, "S", 5)
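
As a quick sanity check on a smaller length ( the expected output here is my own hand-computation from the grammar above, so treat it as a sketch ):

Python:
# for k = 2 only the terminal rules A -> a, B -> b, C -> a feed the concatenations,
# so with the both-ways concatenation this should print {'ab', 'ba'}
print(generate_language(rule_dict, "S", 2))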

Next, I was asked to analyze both the time-complexity of the algorithm and its space-complexity ( excluding the space taken by the function's arguments ), under the assumption that the rules set ( rule_dict ) of the grammar is of constant size. I was asked to explain whether the complexities are Linear, Polynomial or Exponential ( find a tight upper-bound ) in ##k##.

My answer was that both the time and space complexities are Polynomial ( though I couldn't justify the space-complexity ). Here's my explanation:

For Time-complexity:

For each recursive call with input ##(var, k)##, we pass over all rules whose derivation leads to two variables X and Y.
For variable X we check if it can be used to produce a set of strings of lengths ##1, 2, \dots, k-1##.
For variable Y we check if it can be used to produce a set of strings of lengths ##k-1, k-2, \dots, 1##.

Note, because of the memoization, each recursive call with input ##(var, k)## spawns at most ##O(k)## recursive calls.
Hence,
For variable X, we'll have ##k-1## recursive calls in total.
For variable Y, we'll have ##k-1## recursive calls in total.

Since we assumed the size of the set of rules to be constant, there are only ##O(k)## distinct memoized inputs, each doing ##O(k)## work, so the total time-complexity of the algorithm is ##O(k^2)##,
so the time-complexity is polynomial.
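
As a rough cross-check of this counting argument ( my own instrumentation, not part of the assignment ), one can call the recursive helper directly with an explicit memo table and count how many ( variable, length ) pairs actually get memoized:

Python:
# each memo key is a (variable, length) pair; with a constant number of variables
# there can be at most O(k) of them, which is what the argument above relies on
mem = dict()
generate_language_rec(rule_dict, "S", 6, mem)
print(len(mem))  # bounded by (number of variables) * k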

For Space-complexity:

I couldn't figure it out; maybe the space complexity is also polynomial? I couldn't explain it, but I think the answer lies in the lines:
Python:
                s1 = generate_language_rec(rule_dict, X, j, mem)
                s2 = generate_language_rec(rule_dict, Y, k - j, mem)
Questions:
What do you think about the time-complexity of the above algorithm? Am I correct in my analysis that it is polynomial?
How can I proceed to analyze the space-complexity? I feel lost.

Thanks in advance for any help!
 
  • #2
CGandC said:
What do you think about the time-complexity of the above algorithm? Am I correct in my analysis that it is polynomial?
It looks credible, but I haven't checked that the analysis matches the code. Why don't you check it experimentally?

CGandC said:
How can I proceed to analyze the space-complexity? I feel lost.
Add up the size of all the memoized items?
 
  • #3
1. I've checked the running time of the code experimentally for different inputs; the relevant part of the code is:

Python:
rule_dict = {"S": {"AB", "BC"}, "A": {"BA", "a"}, "B": {"CC", "b"}, "C": {"AB", "a"}}
t_0 = time.perf_counter()
strings = generate_language(rule_dict, "S", 10)
t_1 = time.perf_counter()
print(" DELTA ",t_1-t_0)
For k=5: DELTA 0.0005285000000001538
For k=6: DELTA 0.001191899999999968
For k=7: DELTA 0.0022890999999999884
For k=8: DELTA 0.004781599999999997
For k=9: DELTA 0.011498800000000031
For k=10: DELTA 0.022812100000000002

It seems the running time roughly doubles each time ##k## increases by 1, i.e. growth of the form ##y = 2^x##, which is exponential rather than polynomial. Maybe I was wrong in my analysis that the time-complexity is polynomial?

2. I think the space-complexity will be linear. Explanation:
The depth of the recursion tree is ##O(k)##.
The function 'generate_language_rec' has a space-complexity of ##O(1)## if we disregard recursive calls and input sizes.
We store the results of all recursive calls for a constant number of variables ( since we assumed the size of the set of rules to be constant ) in a memoization table.
For input ##(var, k)## and a rule deriving two variables X and Y, each of them produces ##k-1## recursive calls ( the calls will be with inputs ##1, \dots, k-1## for X and ##k-1, \dots, 1## for Y ), and each such call is recorded in the memoization table. So each variable causes ##O(k)## calls in total ( which is also what determines the depth of the recursion ) because of the memoization; hence, since the size of the set of rules is constant, the total space complexity is ##O(k)##.

( Maybe the explanation is a little hard to understand, but I could write it better if I spent more time on it )
 
  • #4
CGandC said:
I've checked the running time of the code experimentally for different inputs
For a number of reasons, measuring execution time is not a good way of experimentally estimating time complexity, particularly with small values of ##k##. Instead, set a counter inside the tightest loop and consider the theoretical time complexity of the slowest action in that loop.
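For example, something along these lines ( a quick sketch; count_calls and call_count are names I made up, and it counts entries to the helper rather than iterations of the innermost loop, which grows at the same rate up to a constant factor ):

Python:
import functools

call_count = 0

def count_calls(fn):
    # wrap the recursive helper so every entry, memo hits included, is counted
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        global call_count
        call_count += 1
        return fn(*args, **kwargs)
    return wrapper

generate_language_rec = count_calls(generate_language_rec)

for k in range(2, 11):
    call_count = 0
    generate_language(rule_dict, "S", k)
    print(k, call_count)  # compare the growth of this count against k**2

Remember though that each call also pays for building the concatenated string sets, and the number of strings of length k can itself grow exponentially, so the call count alone is not the whole cost.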
CGandC said:
I think the Space-complexity will be linear
This is more reliable experimentally: in this case just print sum(map(len, mem.values())) at the end.
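Since generate_language builds its memo table locally, one way to get at it ( a sketch that reuses the helper's existing mem parameter ) is to pass in your own table and inspect it afterwards:

Python:
for k in range(2, 11):
    mem = dict()
    generate_language_rec(rule_dict, "S", k, mem)
    n_strings = sum(map(len, mem.values()))                  # total number of memoized strings
    n_chars = sum(len(s) for v in mem.values() for s in v)   # total characters stored
    print(k, len(mem), n_strings, n_chars)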
 
