So It's Back To First Principles (Part 2)

His head turns before his shot suggest he heard suspicious noises, and others in the audience must have heard them also.

Here’s a video I found several days ago:

The person taking it is several rows in front of Mia Kopps, on the ground just to the left of the tractor and the south telehandler.
The main value is listening to the shots (you can really hear the first one ping off the railing), and watching the reactions of the people.

If you are a total nutcase like me, you can compare this with the Kopps video and notice how in the Kopps video some people have been deleted, some people inserted, and at least a couple of people have been given “extreme makeovers” (i.e. their race was changed). I really don’t know why in this case.

I’m pretty sure it was the Kopps video that was changed because I was suspicious of some of the people in it even before cross referencing with this new (to me) one.

We all agree (except for @phiphi-the-frenchie) that Shot 1 hit Trump’s ear?

Is this a special moment to celebrate? :dancer: :dancing_women: :man_dancing:

Using the FFmpeg method, I am providing additional evidence for @phiphi-the-frenchie that Trump was holding his ear with his hand when Shot 2 was fired. Here is the RSBN original footage, recorded by an independent camera team. This is first-hand video evidence:

https://www.youtube.com/watch?v=Rr63RgGN-Yo

In this frame, you can clearly see the exact moment when the “crack” sound is heard from shot 2 and Trump is holding his ear:

I am going to stop wasting time on PH’s objections and move forward.

OK, I will present my point of view so that we can analyze it all together.

I’m claiming that ToughOldBird’s right arm and elbow were hit.
FlowerLady does move her left arm and shoulder.
I think the shrapnel at her came more from behind her, rather than ‘from her right’.
The white vehicle, between FlowerLady and the TV, also might have protected her right side from the shrapnel spray toward her, from Shot1’s final impact point.
Or, maybe her right side motion was just her reaction to the bang behind her, with no shrapnel impact at all.
ToughOldBird’s back-arching reaction is the most pronounced, and maybe his elbow was unharmed by the rail impact subsonic shot, long gone, to the Southwest.
Copenhaver’s left arm was shot, so he’s separate from my shrapnel deflection
not just from ‘their right’ but ‘from behind’ those 3 people.
Dutch also turns his head toward Copenhaver, just before he (Dutch) is hit with Shot 2.

Copenhaver’s left arm does ‘raise’ in the same 1/24 second as Trump’s ear appears hit (Corey video).

Many other videos show a dust ball off the rail 7, 8, or 9 frames before Copenhaver’s arm begins to ‘raise’.
Doesn’t that prove that the rail impact was ‘early’?

I think this one contains a gunshot at the 1:15 mark.
The Mia Kopps video is the one I think contains a gunshot during Trump’s brief pause between
“comes right out of thee”
POP!
“government services.”

I have used an extract of this RSBN video and I got the following:

Audacity gives me the second crack at 24.431 s. Stepping through the video frame by frame in QuickTime Player (30 frames/s), I see Trump begin to raise his hand at 24 s + 19 frames, or 24.633 s, so about 0.2 seconds after.
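For anyone who wants to check the arithmetic, here is a minimal sketch of the "seconds + frames" conversion. The function name `frame_time` is just for illustration, and the 60 fps line is included only to show how much the result depends on the assumed frame rate.

```python
def frame_time(whole_seconds: int, frame_in_second: int, fps: float) -> float:
    """Time in seconds of a position given as 'whole seconds + N frames'."""
    return whole_seconds + frame_in_second / fps


crack = 24.431  # second crack, as read from the Audacity waveform

# Same "24 s + 19 frames" reading, interpreted at two candidate frame rates.
hand_30 = frame_time(24, 19, 30.0)
hand_60 = frame_time(24, 19, 60.0)

print(f"30 fps: hand at {hand_30:.3f} s ({hand_30 - crack:+.3f} s vs. crack)")
print(f"60 fps: hand at {hand_60:.3f} s ({hand_60 - crack:+.3f} s vs. crack)")
```

The sign of the offset flips between the two lines, so the frame rate of the exact file being stepped through has to be pinned down before the ordering can be argued.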




BTW: I am trying to find the time when the two troopers disappeared from this elevated position.



BTW: they play some studio music - but we can recognize what the real song was.

It’s time to clarify that audio alignment uncertainty.

You’re absolutely right to point out that a video frame represents a time interval—not a single instant—and aligning audio data to video frames can be nuanced. Let’s unpack this a bit:

:clapper: Frame vs. Audio Timing

  • Video Frames: Each frame spans a duration based on the frame rate. For example, at 30 fps, each frame covers ~33.3 ms.
  • Audio Samples: Audio is continuous and sampled at a much higher rate (e.g., 44.1 kHz), so many audio samples fall within one video frame.

This mismatch means that when generating an audio graph per frame (e.g., waveform or spectrogram), you’re aggregating audio over that frame’s time span—not pinpointing a single moment.

(I guess 48k for video sound.)
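To put numbers on the frame/sample mismatch described above, here is a small sketch. It assumes a constant frame rate and the usual 44.1 kHz and 48 kHz sample rates; the actual rates of any given clip would have to be read from the file.

```python
def frame_duration_ms(fps: float) -> float:
    """Duration of one video frame in milliseconds."""
    return 1000.0 / fps


def samples_per_frame(sample_rate_hz: int, fps: float) -> float:
    """Number of audio samples that fall within one video frame."""
    return sample_rate_hz / fps


for fps in (24, 30, 60):
    for rate in (44_100, 48_000):
        print(
            f"{fps} fps, {rate} Hz: frame = {frame_duration_ms(fps):.1f} ms, "
            f"{samples_per_frame(rate, fps):.0f} samples per frame"
        )
```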

:arrows_counterclockwise: Alignment Standards & Practices

There’s no universal standard for aligning continuous audio with video frames, but here are the common practices and variations:

1. FFmpeg’s Approach

  • FFmpeg typically aligns audio and video based on timestamps embedded in the media streams.
  • When extracting audio per frame, FFmpeg uses the frame’s timestamp range to pull the corresponding audio segment.
  • This is accurate if the media container (e.g., MP4) has properly synchronized timestamps.

2. Smartphone Camera Behavior

  • Android: May use separate audio and video pipelines. Audio is often captured with OpenSL ES or AudioRecord, while video uses Camera2 or MediaRecorder. Sync depends on how the app merges them.
  • iOS: AVFoundation handles both audio and video, and generally maintains tighter sync. But apps can still introduce drift if not careful.
  • Variability: Some devices or apps may start audio capture slightly before or after video, or may drop frames/audio samples under load.

3. Professional Cameras

  • High-end cameras often embed timecode (e.g., SMPTE) to ensure precise sync.
  • External audio recorders may use genlock or word clock to stay in sync with video.

:bar_chart: Implications for Audio Graphs

When generating audio graphs per frame:

  • You’re visualizing a windowed segment of audio, not a snapshot.
  • For precise alignment, use frame timestamps and audio sample timestamps.
  • If you’re seeing drift or misalignment, check whether the media container has variable frame rate or missing timestamps.

:mag: Want to go deeper?

Would you like help writing a script to extract and visualize audio per frame using FFmpeg and Python? Or are you working with a specific device or app where sync seems off? Let me know your setup and goals—I can tailor the advice.

I assume an alignment uncertainty of half a frame duration when no further information is given.

Exactly—you nailed it with that assumption. When metadata is stripped (as it often is during social media compression or transcoding), and there’s no explicit timestamp alignment between audio and video, the safest default is to assume that each video frame represents a time window, and the audio associated with it is centered within that window. So yes: half the frame duration is a reasonable uncertainty margin.

For example:

  • At 30 fps, each frame spans ~33.3 ms.

  • Without timestamps, you’d assume the audio graph for that frame represents the segment centered at the midpoint—i.e., ±16.7 ms around the frame time.
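A minimal sketch of that half-frame assumption, for reference. It assumes a constant frame rate with frame 0 starting at t = 0, which may not hold for re-encoded or variable-frame-rate clips; the function names are illustrative only.

```python
def frame_window(frame_index: int, fps: float) -> tuple:
    """Return (start, midpoint, end) in seconds of the interval one frame covers.

    Assumes constant frame rate and that frame 0 starts at t = 0.
    """
    duration = 1.0 / fps
    start = frame_index * duration
    return start, start + duration / 2, start + duration


def half_frame_uncertainty_ms(fps: float) -> float:
    """Half of one frame duration, in milliseconds."""
    return 500.0 / fps


for fps in (30.0, 60.0):
    print(f"{fps:.0f} fps: +/-{half_frame_uncertainty_ms(fps):.1f} ms")

# Example: frame 739 at 30 fps is the '24 s + 19 frames' position.
start, mid, end = frame_window(739, 30.0)
print(f"frame 739 covers {start:.3f} s to {end:.3f} s (midpoint {mid:.3f} s)")
```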

And your lip-sync test? That’s actually one of the most reliable real-world validations. Human perception is incredibly sensitive to audio-visual sync—especially for speech. If the lips and voice feel aligned, then the audio is likely within ±40 ms of the video, which is generally imperceptible to most viewers.

:brain: A Few Nerdy Notes on Sync Tolerance

  • Broadcast recommendations (such as ITU-R BT.1359) put the detectability threshold at roughly 45 ms when audio leads the picture and 125 ms when it lags; offsets of a few tens of milliseconds are generally treated as acceptable.
  • Social platforms often re-encode media with variable frame rates and may slightly shift sync, but they usually preserve perceptual alignment.
  • Machine learning models (e.g., for lip-reading or speaker diarization) often assume audio is centered in the frame window unless timestamps are available.

In cases like this, where milliseconds matter, the key is understanding how the media was recorded and processed:

1. Frame-Based Audio Aggregation

  • If you’re analyzing video frame-by-frame, each frame spans a time window (e.g., 33.3 ms at 30 fps).
  • Audio is typically sampled at 44.1 kHz or 48 kHz, so at 30 fps each frame contains ~1,470 samples (44.1 kHz) or 1,600 samples (48 kHz).
  • Without embedded timestamps, the best assumption is that the audio segment is centered on the frame time—meaning there’s a ±16.7 ms uncertainty.

2. Lip Sync as a Reference

  • Human speech articulation is tightly coupled to audio. If lip movements match the spoken words, then audio is likely aligned within ±20 ms.
  • If the speaker’s lips are in sync with their voice, but the hand movement seems delayed relative to the gunshot sound, it suggests the audio is not significantly misaligned—certainly not more than the frame duration.

3. Gunshot Reaction Timing

  • A gunshot is a sudden, high-energy impulse. The startle reflex (like flinching or grabbing the ear) typically occurs within 100–200 ms of the stimulus.
  • If the speaker begins to raise their hand after the first audible shot but only touches the ear after the second, it’s possible the reaction began during the first shot’s audio window, but the motion completed during the second.

:brain: What You Can Infer

Given your observations:

  • The speaker likely reacted to the first shot, and the motion spanned across the first and second frame/audio windows.
  • The audio track might be off by a small margin (±15–30 ms), but not enough to invalidate the sequence of events.
  • The fact that lip sync looks correct is a strong indicator that audio alignment is within acceptable forensic tolerance.

Studio Broadcasts: These are typically well-synchronized. Audio engineers align the podium mic feed with the video using timecode, delay compensation, and mixing tools. Whether they use an analog XLR line or a digital AES/EBU or Dante feed, they know how to handle latency—especially round-trip delays from processing or transmission.

Smartphones: These are a different beast. They record audio and video using separate subsystems, often with variable frame rates and no embedded timecode. Audio might lag or lead by tens of milliseconds, and echo from loudspeakers can contaminate the signal. So yes—smartphone footage is inherently less reliable for precise timing unless cross-referenced with other sources.

:loud_sound: Loudspeakers and Echo Contamination

  • If the podium mic is close enough to pick up sound from the loudspeakers, you’ll get secondary impulses—delayed versions of the original gunshot or speech.
  • The delay depends on distance: sound travels ~343 m/s, so a loudspeaker 10 meters away introduces ~29 ms of delay.
  • Echo cancellation is essential in live setups. Most professional rigs use directional mics, gating, and DSP-based echo suppression to isolate the podium voice and minimize ambient pickup.
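The distance-to-delay figure above is easy to reproduce. This sketch assumes a speed of sound of 343 m/s (dry air at about 20 °C); the distances are illustrative, not measurements from the site.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed value; varies with temperature


def acoustic_delay_ms(distance_m: float, speed_m_s: float = SPEED_OF_SOUND_M_S) -> float:
    """Time for sound to travel distance_m, in milliseconds."""
    return 1000.0 * distance_m / speed_m_s


for distance in (5, 10, 50, 100):
    delay = acoustic_delay_ms(distance)
    frames_30 = delay / (1000.0 / 30.0)  # same delay expressed in 30 fps frames
    print(f"{distance:>4} m: {delay:6.1f} ms ({frames_30:.2f} frames at 30 fps)")
```

At 10 m the delay is a bit under one 30 fps frame, so microphone and loudspeaker distances matter at the precision being argued about here.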

:brain: What This Means for Your Gunshot Analysis

  • Studio footage is likely your most trustworthy source for timing—especially if lip sync is tight and the gunshot sounds are crisp.

  • Smartphone footage can be useful for triangulation, but you’ll need to account for:

    • Audio drift
    • Echo from loudspeakers
    • Frame rate variability
  • If the speaker reacts to the first shot but touches their ear after the second, and lip sync is intact, then the audio is likely aligned within ±20–30 ms—good enough to trust the sequence of events.

(to be continued)

Using the command ffprobe, you can determine the frame rate of a video:

.\ffprobe -v 0 -of compact=p=0:nk=1 -select_streams v:0 -show_entries stream=r_frame_rate "BREAKING_ ASSASSINATION ATTEMPT ON PRESIDENT TRUMP AT BUTLER, P.A. RALLY- 7_13_2.mp4"

The result is 60/1, which corresponds to 60 frames per second, not 30.
At 60 frames per second, 24 s + 19 frames is 24.317 seconds, so Trump raises his hand well before your “crack” of shot 2 at 24.431 seconds.
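For anyone who prefers to script the check, here is a rough Python wrapper around the same kind of ffprobe query. It assumes `ffprobe` is on the PATH and asks for both `r_frame_rate` and `avg_frame_rate`, since the two can differ on variable-frame-rate files; the helper names are my own.

```python
import json
import subprocess
from fractions import Fraction


def parse_rate(text: str) -> float:
    """Convert an ffprobe rate string such as '60/1' or '30000/1001' to fps."""
    try:
        return float(Fraction(text))
    except ZeroDivisionError:
        # ffprobe reports '0/0' when a rate is unknown.
        return 0.0


def probe_frame_rates(path: str) -> dict:
    """Return r_frame_rate and avg_frame_rate of the first video stream."""
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-select_streams", "v:0",
         "-show_entries", "stream=r_frame_rate,avg_frame_rate",
         "-of", "json", path],
        capture_output=True, text=True, check=True,
    )
    stream = json.loads(result.stdout)["streams"][0]
    return {key: parse_rate(stream[key])
            for key in ("r_frame_rate", "avg_frame_rate")}


# Usage (requires ffprobe and a local file; the file name is a placeholder):
# print(probe_frame_rates("extract.mp4"))
```

Running it on both the full download and any extract would show directly whether the two files really have different frame rates.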

You have wasted so much of our time with incorrect analysis.

Perhaps you owe us an apology?

No, the video I used is an extract at 30 frames/s.
I get the same result with other videos from major media outlets.

Check it using my method (Audacity + QuickTime) and tell me what is wrong, if something is wrong!

As for apologizing to you, that would hurt my ass!

Nothing is wrong — I get the exact same result using your method.

I used an extract of the RSBN video, just as you did. I added timestamps, then extracted the audio and measured the timing of the “crack” from shot 2 at 61.937 seconds (00:01:01.937).

I then opened the same video in QuickTime and compared the frames:

  • At 00:01:01.733, Trump is raising his hand.

At 00:01:01.933, Trump is covering his ear with his hand.
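As a sanity check on those three numbers, here is a small sketch that compares each video timestamp with the audio time of the crack, treating anything within half a frame as undecidable. It assumes the 60 fps reported by ffprobe and ignores any audio/video sync error in the file itself.

```python
def compare_to_audio(video_t: float, audio_t: float, fps: float) -> str:
    """Classify a video event relative to an audio event, with half-frame tolerance."""
    half_frame = 0.5 / fps
    delta = video_t - audio_t
    if delta < -half_frame:
        return "before"
    if delta > half_frame:
        return "after"
    return "within half a frame"


CRACK = 61.937  # audio time of the shot 2 crack, in seconds
FPS = 60.0      # assumed frame rate of the timestamped video

for label, t in (("raising hand", 61.733), ("covering ear", 61.933)):
    verdict = compare_to_audio(t, CRACK, FPS)
    print(f"{label}: {t - CRACK:+.3f} s relative to crack -> {verdict}")
```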

If you are using QuickTime, I assume you have a Mac. FFmpeg also works on Mac, and it is free. In the future, you should always add timestamps to your video before making claims — it will save everyone a whole lot of time in this forum.
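One possible way to burn timestamps into a video is FFmpeg's `drawtext` filter with the `pts` text function. The sketch below only builds and runs the command; it is not necessarily the exact command used earlier in this thread. The file names are placeholders, and `drawtext` needs an FFmpeg build with font support (or an explicit `fontfile=` option).

```python
import subprocess


def timestamp_overlay_command(src: str, dst: str) -> list:
    """Build an ffmpeg command that overlays HH:MM:SS.mmm on every frame."""
    # %{pts\:hms} expands to the presentation timestamp of each frame.
    draw = (r"drawtext=text='%{pts\:hms}':x=20:y=20:fontsize=48:"
            "fontcolor=white:box=1:boxcolor=black@0.6")
    # Video is re-encoded because of the filter; audio is copied unchanged.
    return ["ffmpeg", "-i", src, "-vf", draw, "-c:a", "copy", dst]


def add_timestamps(src: str, dst: str) -> None:
    """Run the overlay command; raises if ffmpeg exits with an error."""
    subprocess.run(timestamp_overlay_command(src, dst), check=True)


# Usage (placeholder file names):
# add_timestamps("extract.mp4", "extract_timestamped.mp4")
```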

In any case, we have now provided sufficient proof that it was shot 1 which struck Trump’s ear, not shot 2.

To better understand each other, may I ask you a few very simple questions step by step, so we can work toward common ground? I think we already agree on about 85% of the points. It would be great if we could raise that to 99%, so we share a clear common view on this shooting.

The main goal is to challenge the lone-shooter narrative — specifically, the claim that Crooks fired all eight shots.

Let’s start simple and straightforward:

This X post from July 19, 2024 (six days after the shooting) shows a video of an impact on the top railing of the bleachers.

https://x.com/YWNReporter/status/1814294874860167397

My first question:

By looking at the video mentioned above, can we agree that the top corner of the bleachers was struck by a bullet? I’m not asking which shot it was or from where it came — just this single point.

My answer would definitely be “yes.”

I don’t use time stamps but I use QuickTime’s elapsed time indicator, which allows me to display the elapsed time or the number of frames elapsed.
At 0:24, I have 706 frames, and at 0:25, 736, so that’s 30 frames per second.

Trump starts raising his hand at 725 frames, or about 0:24.63, or 0.2 seconds after the second crack at 0:24.431.
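The same calculation as a short sketch, in case anyone wants to redo it with their own counter readings. It assumes the frame counter advances at a constant rate between the two readings; the function names are illustrative.

```python
def fps_from_counter(frames_a: int, seconds_a: float,
                     frames_b: int, seconds_b: float) -> float:
    """Estimate frame rate from two (frame counter, elapsed time) readings."""
    return (frames_b - frames_a) / (seconds_b - seconds_a)


def counter_to_time(frame: int, ref_frame: int, ref_seconds: float, fps: float) -> float:
    """Convert a frame counter value to seconds, relative to a known reading."""
    return ref_seconds + (frame - ref_frame) / fps


fps = fps_from_counter(706, 24.0, 736, 25.0)
hand = counter_to_time(725, 706, 24.0, fps)
crack = 24.431  # second crack from Audacity

print(f"estimated frame rate: {fps:.1f} fps")
print(f"hand starts rising at {hand:.3f} s ({hand - crack:+.3f} s vs. crack)")
```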

I don’t know why I’m not getting the same result as you.

You’re obviously doing something wrong. Go directly to YouTube:

https://www.youtube.com/watch?v=Rr63RgGN-Yo

Press the period (.) key to advance frame by frame; it takes exactly 60 keystrokes to move one second on the timeline, so the original video is definitely 60 frames/s.

If you can’t even download a YouTube video correctly, that suggests you don’t have the skills for this type of analysis and you’re wasting everyone’s time.

Suggestions of an explosive device, planted in the rail, were made by some.
I think the rail top was hit by a suppressed, subsonic gunshot, because I heard ‘dozens’ of them in Dayve’s audio.
At least 5 of those shots were fired after Trump’s words, “couple of months old, that chart”.

You are getting ahead of your skis.

Since you’re getting involved, I’m asking you @flamecensor (and Sonja) a very simple yes-or-no question:

By watching this original video posted six days after the shooting:

https://x.com/YWNReporter/status/1814294874860167397

can we agree that a bullet struck the top corner of the bleachers?

It’s possible I could agree, but it’s no advantage to my case.
Deal suggestion: I’ll agree, if you agree that Copenhaver’s arm ‘raised’, or ‘began to raise’, less than 2 frames after Trump’s ear impact frame (from Corey’s 24-frame-per-second video).

I’ll take that as a yes.

At this stage, we’re not debating — we’re simply confirming clear, well-documented facts in a straightforward, step-by-step manner.

We will get to Copenhaver and all the other details in due time.

By the way, apologies if I’ve missed something, but what exactly is your case? Since you refer so often to the Copenhaver court case, are you actually defending Copenhaver in court? It would help us better understand your motivation in this forum if you could share more details about “your case”.

@sonjax6, any comments from your side would be greatly appreciated.

Can we agree that, in the video mentioned above, we can clearly see a bullet striking the corner of the bleachers?

My ‘case’ is that almost everyone reporting on this crime is lying.
I refer to Copenhaver most, because he was shot twice, so 100% more than any other victim.
An apology for all the lies would be enough, and I’ll give credit to NBC’s Tom Llamas, for not lying.

That’s ridiculous. Even a fat subsonic bullet would be traveling at about 1000 feet/second, so how could I claim that I see it (clearly or unclearly)?

Oh yes, I agree the railing was hit by a bullet, at or very close to the same time that Trump was struck.

I’m pretty sure this is one of the videos with AI generated people in the foreground - although it’s not from the media area. I looked at the sightlines - specifically what part of the barn is right behind the left side of the TV and what part of the crane is behind the right side of the TV, drew them on an aerial photo, and found out this video seems to have been taken from crazy far back in the audience - even further than RagTagVagabond.

As far as the faked people in the foreground: here are a couple of goofs:

I DO think the main part of the video is true, although I might be a little naïve in that aspect.

Here’s another video of the south bleacher shot, taken from pretty close in:

One can make side-by-sides of frames of the two videos to try to see in what direction the puff of “smoke” is expanding.