Skip to content

Commit

Permalink
Merge pull request #899 from thephpleague/fix-embed-parsing
Browse files Browse the repository at this point in the history
Fix embed parsing
  • Loading branch information
colinodell authored Jul 17, 2022
2 parents da3ce30 + ce60e0d commit d3e531a
Show file tree
Hide file tree
Showing 9 changed files with 116 additions and 6 deletions.
11 changes: 11 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,17 @@ Updates should follow the [Keep a CHANGELOG](https://keepachangelog.com/) princi

## [Unreleased][unreleased]

### Changed

- Made a number of small tweaks to the embed extension's parsing behavior to fix #898:
- Changed `EmbedStartParser` to always capture embed-like lines in container blocks, regardless of parent block type
- Changed `EmbedProcessor` to also remove `Embed` blocks that aren't direct children of the `Document`
- Increased the priority of `EmbedProcessor` to `1010`

### Fixed

- Fixed `EmbedExtension` not parsing embeds following a list block (#898)

## [2.3.3] - 2022-06-07

### Fixed
Expand Down
2 changes: 1 addition & 1 deletion src/Extension/Embed/EmbedExtension.php
Original file line number Diff line number Diff line change
Expand Up @@ -42,7 +42,7 @@ public function register(EnvironmentBuilderInterface $environment): void

$environment
->addBlockStartParser(new EmbedStartParser(), 300)
->addEventListener(DocumentParsedEvent::class, new EmbedProcessor($adapter, $environment->getConfiguration()->get('embed.fallback')))
->addEventListener(DocumentParsedEvent::class, new EmbedProcessor($adapter, $environment->getConfiguration()->get('embed.fallback')), 1010)
->addRenderer(Embed::class, new EmbedRenderer());
}
}
16 changes: 13 additions & 3 deletions src/Extension/Embed/EmbedProcessor.php
Original file line number Diff line number Diff line change
Expand Up @@ -16,6 +16,7 @@
use League\CommonMark\Event\DocumentParsedEvent;
use League\CommonMark\Extension\CommonMark\Node\Inline\Link;
use League\CommonMark\Node\Block\Paragraph;
use League\CommonMark\Node\Inline\Text;
use League\CommonMark\Node\NodeIterator;

final class EmbedProcessor
Expand All @@ -34,9 +35,18 @@ public function __construct(EmbedAdapterInterface $adapter, string $fallback = s

public function __invoke(DocumentParsedEvent $event): void
{
$embeds = [];
foreach (new NodeIterator($event->getDocument()) as $node) {
if ($node instanceof Embed) {
$document = $event->getDocument();
$embeds = [];
foreach (new NodeIterator($document) as $node) {
if (! ($node instanceof Embed)) {
continue;
}

if ($node->parent() !== $document) {
$replacement = new Paragraph();
$replacement->appendChild(new Text($node->getUrl()));
$node->replaceWith($replacement);
} else {
$embeds[] = $node;
}
}
Expand Down
3 changes: 1 addition & 2 deletions src/Extension/Embed/EmbedStartParser.php
Original file line number Diff line number Diff line change
Expand Up @@ -13,7 +13,6 @@

namespace League\CommonMark\Extension\Embed;

use League\CommonMark\Node\Block\Document;
use League\CommonMark\Parser\Block\BlockStart;
use League\CommonMark\Parser\Block\BlockStartParserInterface;
use League\CommonMark\Parser\Cursor;
Expand All @@ -24,7 +23,7 @@ class EmbedStartParser implements BlockStartParserInterface
{
public function tryStart(Cursor $cursor, MarkdownParserStateInterface $parserState): ?BlockStart
{
if ($cursor->isIndented() || $parserState->getParagraphContent() !== null || ! ($parserState->getLastMatchedBlockParser()->getBlock() instanceof Document)) {
if ($cursor->isIndented() || $parserState->getParagraphContent() !== null || ! ($parserState->getActiveBlockParser()->isContainer())) {
return BlockStart::none();
}

Expand Down
8 changes: 8 additions & 0 deletions tests/functional/Extension/Embed/EmbedExtensionTest.php
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,7 @@

use League\CommonMark\ConverterInterface;
use League\CommonMark\Environment\Environment;
use League\CommonMark\Extension\Autolink\AutolinkExtension;
use League\CommonMark\Extension\CommonMark\CommonMarkCoreExtension;
use League\CommonMark\Extension\Embed\EmbedExtension;
use League\CommonMark\MarkdownConverter;
Expand All @@ -30,10 +31,17 @@ protected function createConverter(array $config = []): ConverterInterface
{
$config['embed']['adapter'] = new FakeAdapter();

$enableAutolinkExtension = $config['enable_autolinks'] ?? false;
unset($config['enable_autolinks']);

$environment = new Environment($config);
$environment->addExtension(new CommonMarkCoreExtension());
$environment->addExtension(new EmbedExtension());

if ($enableAutolinkExtension) {
$environment->addExtension(new AutolinkExtension());
}

return new MarkdownConverter($environment);
}

Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
<h1>Embed Bug</h1>
<ul>
<li>Both links below</li>
<li>Should be embeds</li>
</ul>
<iframe class="embed" src="https://www.youtube.com/watch?v=dQw4w9WgXcQ"></iframe>
<iframe class="embed" src="https://www.youtube.com/watch?v=dQw4w9WgXcQ"></iframe>
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
# Embed Bug

- Both links below
- Should be embeds

https://www.youtube.com/watch?v=dQw4w9WgXcQ

https://www.youtube.com/watch?v=dQw4w9WgXcQ
Original file line number Diff line number Diff line change
@@ -0,0 +1,27 @@
<p>This is an embed:</p>
<iframe class="embed" src="https://www.youtube.com/watch?v=dQw4w9WgXcQ"></iframe>
<p>So is this, with only three spaces of indentation:</p>
<iframe class="embed" src="https://www.youtube.com/watch?v=dQw4w9WgXcQ"></iframe>
<p>This is not, because it's fully indented:</p>
<pre><code>https://www.youtube.com/watch?v=dQw4w9WgXcQ
</code></pre>
<p>This is not, because it has extra bits after the URL:</p>
<p><a href="https://www.youtube.com/watch?v=dQw4w9WgXcQ">https://www.youtube.com/watch?v=dQw4w9WgXcQ</a> &lt;-- you gotta watch this!</p>
<p>This is not, because it's in a fenced code block:</p>
<pre><code class="language-md">
https://www.youtube.com/watch?v=dQw4w9WgXcQ

</code></pre>
<p>And this isn't either because it's inline: <img src="https://www.youtube.com/watch?v=dQw4w9WgXcQ" alt="" /></p>
<p>Embeds can't be nested in other blocks:</p>
<ul>
<li>https://www.youtube.com/watch?v=dQw4w9WgXcQ
<ul>
<li>https://www.youtube.com/watch?v=dQw4w9WgXcQ</li>
</ul>
</li>
</ul>
<p>This isn't valid because it's a lazy paragraph continuation:
<a href="https://www.youtube.com/watch?v=dQw4w9WgXcQ">https://www.youtube.com/watch?v=dQw4w9WgXcQ</a></p>
<iframe class="embed" src="https://www.youtube.com/watch?v=dQw4w9WgXcQ"></iframe>
<p>^ This is fine, though</p>
Original file line number Diff line number Diff line change
@@ -0,0 +1,40 @@
---
enable_autolinks: true
---

This is an embed:

https://www.youtube.com/watch?v=dQw4w9WgXcQ

So is this, with only three spaces of indentation:

https://www.youtube.com/watch?v=dQw4w9WgXcQ

This is not, because it's fully indented:

https://www.youtube.com/watch?v=dQw4w9WgXcQ

This is not, because it has extra bits after the URL:

https://www.youtube.com/watch?v=dQw4w9WgXcQ <-- you gotta watch this!

This is not, because it's in a fenced code block:

```md

https://www.youtube.com/watch?v=dQw4w9WgXcQ

```

And this isn't either because it's inline: ![](https://www.youtube.com/watch?v=dQw4w9WgXcQ)

Embeds can't be nested in other blocks:

- https://www.youtube.com/watch?v=dQw4w9WgXcQ
- https://www.youtube.com/watch?v=dQw4w9WgXcQ

This isn't valid because it's a lazy paragraph continuation:
https://www.youtube.com/watch?v=dQw4w9WgXcQ

https://www.youtube.com/watch?v=dQw4w9WgXcQ
^ This is fine, though

0 comments on commit d3e531a

Please sign in to comment.