AI Detection - Phase 2: finding hypotheses

In summary, "AI Detection - Phase 2: finding hypotheses" focuses on developing and testing various hypotheses to improve the accuracy and effectiveness of AI detection systems. This phase involves gathering data, analyzing patterns, and refining detection algorithms based on the insights gained. The goal is to enhance the system's ability to identify AI-generated content through rigorous hypothesis testing and validation methodologies.
  • #1
fresh_42
Let us gather possible hypotheses here against which we can test our data. For example:

jack action said:
From the intro of an Insight I wrote:

CC: 621 (100% human / 0% AI)
A: 100% AI (104 words)
B: 0.04% AI (113 tokens)

If this wasn't a typo, then it is the first indication of what I'm after: a very significant gap between A and B. How could that have happened?
@jack action can you quote the text?
 
  • #2
fresh_42 said:
@jack action can you quote the text?

If a moving vehicle has an energy source that has a variable power output, the energy source must be set to its maximum power – during the entire velocity range – to ensure that the vehicle will get its maximum possible acceleration throughout that velocity range. At any given velocity: The force applied to the vehicle dictates the acceleration it gets; The power applied to the vehicle dictates the force it gets; Therefore, the maximum possible acceleration of the vehicle depends solely on the maximum power available for the vehicle. When it comes to accelerating a moving vehicle, only power tells the whole story.

To be fair (I did not notice this earlier), ZeroGPT suggests "Please input more text for a more accurate result". Still, the whole text is considered most likely generated by AI.

I did not put the rest of the text as there were a lot of equations and variable names and I thought it would skew the result. Anyway, I tried the whole text just now:
Statement If a moving vehicle has an energy source that has a variable power output, the energy source must be set to its maximum power – during the entire velocity range – to ensure that the vehicle will get its maximum possible acceleration throughout that velocity range. At any given velocity: The force applied to the vehicle dictates the acceleration it gets; The power applied to the vehicle dictates the force it gets; Therefore, the maximum possible acceleration of the vehicle depends solely on the maximum power available for the vehicle. When it comes to accelerating a moving vehicle, only power tells the whole story. Explanations Force requirement The first basic requirement is given by Newton’s second law: The force F required is equal to the mass m of the vehicle times the desired acceleration a of the vehicle. In simple terms: F = ma. Power requirement But since there is a force in motion, work is done, so there is a second requirement: The power P required is equal to the force F applied to the vehicle times the velocity v of the vehicle. In simple terms: P = Fv. Putting it all together If the two equations are combined together, we get P = mav. This means that as long as there are a mass m and velocity v (i.e. not equal to zero), the power P required is proportional to the desired acceleration a. At this point, we can ignore Newton’s second law because it is indirectly implied in this new equation, i.e if the power requirement is fulfilled, the force requirement is also necessarily fulfilled. We have been talking about “desired acceleration” and “required power” until now but, in the real world, we are often given a power rating from an energy source and we take whatever acceleration we can get from it. In this case, the equation can be rewritten as a = P/(mv). With this new equation, assuming power and mass are constants, we can see that the acceleration is a function of velocity. Particularly, as the velocity increases, the acceleration will decrease. 
Since the mass m is a constraint given by the initial problem, it cannot be modified. The velocity v is also a constraint given by the initial problem, that is, it must be within the desired velocity range. So if one wants to increase the acceleration throughout the velocity range, one has no other choice but to increase the power available to the vehicle. If the power P is doubled, the acceleration a throughout the velocity range will also be doubled (remembering that the acceleration will still decrease as the velocity increases). Power is power Because of the law of conservation of energy, the power available to the vehicle is equal to the power given by the energy source powering the vehicle (not considering losses). The energy source can make its power with: a rotational system (P = torque times angular velocity); fluid power (P = pressure times volumetric flow rate); electricity (P = potential difference times current); combustion (P = fuel mass flow rate times fuel heating value); or any other way one can think of, it does not matter. Although, in any case, note that there may be some inefficiencies that will lead to some losses due to transformations between the energy source and the point of application on the vehicle. Obviously, only the power available at the point of application on the vehicle is relevant. A common mistake When considering the special case where a vehicle is powered by wheels of radius r, some people like to state they can link the acceleration directly to the wheel torque T, by using the relation F = T/r instead of the power equation we used. Combining this equation with Newton’s second law, they get T/r = ma and claim that it is a more direct way because the wheel radius r is constant (unlike the velocity v). But where does that radius comes from? Are we allowed to choose any value? The equation F = T/r is subjected to the law of conservation of energy which extends to power, namely, Pin = Pout. 
With a rotating object, Pin = Tω (where ω is the object angular velocity) and Pout = Fv. This means that T/F = v/ω. So if T/F = r, then v/ω = r as well. The radius r implies a transformation where power is kept constant, and that cannot be ignored. Replacing r with the velocity ratio in the misleading equation will give Tω/v = ma. Thus we get back to our original equation: P = mav. The introduction of the wheel radius does not simplify the process, it just hides the important notion of conservation of energy. Even with this special case [1], there are no ways around it, one way or another, power will have to be considered because velocity must be considered when accelerating a moving vehicle.
And now ZeroGPT says that the text is most likely human-written, with 12.77% AI. The portion it flags as AI-generated is the same one as before.

For its part, OpenAI dropped to 0.02% AI, based on the first 510 tokens of the total 980.
 
  • #3
That matches my experience and is evidence for my hypothesis, assuming that "....; The ..." is a technical mistake. Or is it allowed in English to continue with a capital letter after a semicolon? It would be a mistake in German, but I don't know such subtleties in English.

So, the length of a sample might make a difference and ZeroGPT needs lengthier samples than OpenAI.
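
One way to test the length hypothesis is to score growing prefixes of the same text and watch how the verdict moves. The following is only a sketch: `score_fn` is a hypothetical stand-in for whatever detector is being queried (neither ZeroGPT nor OpenAI is called here), and the step of 50 words is an arbitrary assumption.

```python
from typing import Callable, List, Tuple


def score_by_prefix_length(
    text: str,
    score_fn: Callable[[str], float],  # hypothetical detector: text -> AI score
    step: int = 50,                    # assumed prefix increment, in words
) -> List[Tuple[int, float]]:
    """Score prefixes of `text` that grow by `step` words at a time.

    Returns a list of (number_of_words, score) pairs, ending with the full text.
    """
    words = text.split()
    results = []
    for n in range(step, len(words) + step, step):
        n = min(n, len(words))  # the last prefix is the whole text
        results.append((n, score_fn(" ".join(words[:n]))))
    return results
```

If the score of one detector changes sharply between the short and the long prefixes while the other stays flat, that would support the idea that the two need samples of different length.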
 
  • #4
fresh_42 said:
So, the length of a sample might make a difference and ZeroGPT needs lengthier samples than OpenAI.
Even with the extra text, ZeroGPT assumes the initial text is AI-generated.
fresh_42 said:
Or is it allowed in English to continue with a capital letter after a semicolon?
No, it's not. But I don't like that rule, and I often write a capital letter anyway. I try to correct myself, but it is still an ongoing inner battle every time. :smile: I'm even worse with a colon (where a capital letter is also not required in most cases, but is sometimes allowed).
 
  • #5
fresh_42 said:
How could that have happened?
If the point of the thread is to answer this question, perhaps the most likely explanation is that the Insight, now that it is on the web, is part of the training set for one or more of these programs.

If the point is that there is a weak correlation between A and B, that can be explained if they are looking at different things. If trying to identify whether a vehicle is a fire engine, one might ask "is it big?" and the other "is it red?". Both will agree a small black sports car is not a fire engine.

One could always produce a better estimator by using the outputs of A and B as inputs to either a classical statistical combination or something more complex and ML-like.
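
As a rough illustration of the "classical statistical combination" option, here is a minimal sketch that averages the two detector outputs in log-odds space. The function names, the equal weights, and the zero bias are all assumptions made for illustration; in practice the weights would have to be fitted on texts of known origin (for example with logistic regression).

```python
import math


def logit(p: float, eps: float = 1e-6) -> float:
    """Log-odds of p, clipped away from 0 and 1 to stay finite."""
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))


def sigmoid(x: float) -> float:
    """Inverse of logit: maps log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def combine_scores(p_a: float, p_b: float,
                   w_a: float = 0.5, w_b: float = 0.5,
                   bias: float = 0.0) -> float:
    """Weighted combination of two AI probabilities in log-odds space.

    w_a, w_b and bias are hypothetical placeholders, not fitted values.
    """
    return sigmoid(w_a * logit(p_a) + w_b * logit(p_b) + bias)


# The A/B pair from post #1: A says 100% AI, B says 0.04% AI.
combined = combine_scores(1.0, 0.0004)
```

With equal weights, two detectors that agree return their common score, and two that disagree strongly pull the result toward the middle, which is about all one can say without labelled data.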
 
  • #6
Vanadium 50 said:
If the point of the thread is to answer ...
  1. Which one of the two is more trustworthy in general and for which kind of texts?
  2. Which circumstances trigger false answers?
    So far, it looks as if a couple of errors trigger the verdict "human" on both, independently of the real source.
  3. We have had examples where one said ~100% human and the other one ~100% fake.
    What are the reasons for such discrepancies?
    And I do not mean the underlying model which we cannot know much about.
    I mean, which text has to go to which engine in order to receive a trustworthy result?
The goal is not to analyze the detection machines in terms of their construction plans or training levels. The goal, as I see it, is to decide when to ask one bot and when to ask the other one, or ideally: always ask bot xy. I am a user and I want to know which one I should use, possibly depending on the characteristics of the input text (length, errors, commas, formulas, links, etc.).
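
To compare the detectors along those characteristics, each sample text could first be reduced to a few counts. The sketch below is only one possible choice: the feature names and the regular expressions are assumptions, and the "formula" and "capital after semicolon" patterns are crude heuristics rather than a real parser.

```python
import re


def text_features(text: str) -> dict:
    """Count a few surface characteristics of a sample text."""
    return {
        "n_words": len(text.split()),
        "n_commas": text.count(","),
        # Anything that looks like an http(s) URL.
        "n_links": len(re.findall(r"https?://\S+", text)),
        # Crude formula heuristic: "name = something", e.g. "F = ma".
        "n_formula_like": len(re.findall(r"\b\w+\s*=\s*\S+", text)),
        # The "....; The ..." pattern discussed in posts #3 and #4.
        "capital_after_semicolon": len(re.findall(r";\s+[A-Z]", text)),
    }


sample = ("At any given velocity: The force dictates the acceleration; "
          "The power dictates the force, so F = ma. See https://example.com")
features = text_features(sample)
```

Tabulating these counts next to the two detector outputs for each sample would show whether, say, formulas or punctuation errors go along with the large A/B discrepancies.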
 

