Deep Dive: URL Captures
The complete metadata taxonomy: 11 categories and 115+ fields extracted from web pages.
By Rouslan Zenetl
When you share a URL, Recall fetches the page and extracts structured metadata into 11 categories.
1. Content Classification
Identifies what kind of content this is and how the source categorizes it.
| Field | Swift Type | Description |
|---|---|---|
| type | String? | Content type: article, video, product, etc. |
| category | String? | Topic category from source |
| section | String? | Website section: Opinion, News, etc. |
| description | String? | Summary or excerpt |
| keywords | String? | Comma-separated topic tags |
| language | String? | ISO 639-1 language code |
| locale | String? | Full locale identifier (e.g., en_US) |
| locales | String? | Available locales, comma-separated |
| canonicalUrl | String? | Canonical URL for the content |
| license | String? | Content license |
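Several fields in this taxonomy (keywords, locales, authors, topics) hold comma-separated values in a single `String?`. A minimal sketch of turning such a value into an array follows; the helper name `splitList` is hypothetical and not part of Recall's API.

```swift
import Foundation

/// Hypothetical helper: splits a comma-separated metadata value
/// into trimmed, non-empty items. A nil input yields an empty array.
func splitList(_ raw: String?) -> [String] {
    guard let raw = raw else { return [] }
    return raw
        .split(separator: ",")                               // empty pieces are dropped by default
        .map { $0.trimmingCharacters(in: .whitespaces) }     // remove padding around each item
        .filter { !$0.isEmpty }                              // drop items that were only whitespace
}
```

For example, `splitList("swift, ios ,, metadata")` returns `["swift", "ios", "metadata"]`.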
2. Authorship
Who created and published the content.
| Field | Swift Type | Description |
|---|---|---|
| author | String? | Primary author name |
| authors | String? | All authors, comma-separated |
| contributors | String? | Contributors, comma-separated |
| publisher | String? | Publisher organization |
| organization | String? | Organization name |
| copyright | String? | Copyright statement |
| editor | String? | Editor name |
3. Temporal
When content was created, updated, and for media, how long it runs.
| Field | Swift Type | Description |
|---|---|---|
| publishedDate | Date? | Publication timestamp |
| modifiedDate | Date? | Last modification timestamp |
| expirationDate | Date? | Content expiration date |
| createdDate | Date? | Creation timestamp |
| duration | String? | Media duration (ISO 8601: PT15M30S) |
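Because `duration` is stored as an ISO 8601 string rather than a number, a consumer usually has to convert it before sorting or displaying it. The sketch below handles only the simple time-only form `PT[nH][nM][nS]` with whole numbers; the function name `durationSeconds` is an assumption made for illustration, and date components or fractional seconds are deliberately rejected.

```swift
/// Hypothetical helper: parses a time-only ISO 8601 duration such as
/// "PT15M30S" into seconds. Returns nil for anything outside the
/// simple PT[nH][nM][nS] form with whole-number components.
func durationSeconds(_ iso: String?) -> Int? {
    guard let iso = iso, iso.hasPrefix("PT"), iso.count > 2 else { return nil }
    var total = 0
    var digits = ""
    for ch in iso.dropFirst(2) {
        if ch.isASCII && ch.isNumber {
            digits.append(ch)          // accumulate the number before a unit letter
            continue
        }
        // A unit letter must be preceded by at least one digit.
        guard let value = Int(digits) else { return nil }
        switch ch {
        case "H": total += value * 3600
        case "M": total += value * 60
        case "S": total += value
        default: return nil            // unsupported designator (e.g. "." or "D")
        }
        digits = ""
    }
    // Trailing digits without a unit letter make the string invalid.
    return digits.isEmpty ? total : nil
}
```

With the example from the table, `durationSeconds("PT15M30S")` returns `930`.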
4. Media Assets
Visual and audio content. Thumbnails are automatically downloaded and cached.
| Field | Swift Type | Description |
|---|---|---|
| thumbnailUrl | String? | Preview image URL |
| imageUrl | String? | Primary image URL |
| imageWidth | String? | Image width in pixels |
| imageHeight | String? | Image height in pixels |
| videoUrl | String? | Video URL |
| audioUrl | String? | Audio URL |
5. Ratings & Reviews
User ratings for products and reviewed content.
| Field | Swift Type | Description |
|---|---|---|
| ratingValue | String? | Rating value (e.g., “4.5”) |
| ratingBest | String? | Maximum rating (e.g., “5”) |
| ratingCount | String? | Number of ratings |
| reviewCount | String? | Number of reviews |
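Since sites use different rating scales, comparing captures is easier once `ratingValue` is normalized against `ratingBest`. The following is a sketch only: the function name is hypothetical, and falling back to a maximum of 5 when `ratingBest` is missing is an assumption borrowed from Schema.org's default for `bestRating`, not documented Recall behavior.

```swift
/// Hypothetical helper: converts a rating to a 0...1 fraction of the best
/// possible score. Assumes a maximum of 5 when `best` is absent.
func normalizedRating(value: String?, best: String?) -> Double? {
    guard let value = value,
          let score = Double(value),
          let maximum = Double(best ?? "5"),
          maximum > 0 else { return nil }   // reject unparseable or non-positive scales
    return score / maximum
}
```

A rating of "4.5" out of "5" normalizes to 0.9, and the same call with a "10" scale gives 0.45.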
6. Commercial
E-commerce data. Captures point-in-time price and availability.
| Field | Swift Type | Description |
|---|---|---|
| price | String? | Product price |
| priceCurrency | String? | ISO 4217 currency code |
| availability | String? | Stock status |
| brand | String? | Product brand |
| sku | String? | Stock keeping unit |
| condition | String? | new, used, refurbished |
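The `price` and `priceCurrency` strings only make sense together, so it can help to pair them into one value before comparing captures over time. This is a sketch under assumptions: `CapturedPrice` and `capturedPrice(price:currency:)` are hypothetical names, and the price string is assumed to use a period as the decimal separator.

```swift
import Foundation

/// Hypothetical value type pairing an amount with its currency code.
struct CapturedPrice: Equatable {
    let amount: Decimal
    let currency: String
}

/// Hypothetical helper: builds a CapturedPrice from the two string fields.
/// Returns nil if either field is missing, the currency is not three
/// ASCII letters, or the price has no leading numeric value.
func capturedPrice(price: String?, currency: String?) -> CapturedPrice? {
    guard let price = price, let currency = currency else { return nil }
    let code = currency.uppercased()
    guard code.count == 3, code.allSatisfy({ $0.isASCII && $0.isLetter }) else { return nil }
    // Decimal avoids binary floating-point rounding for money.
    // Note: Decimal(string:) is lenient and ignores trailing non-numeric text.
    guard let amount = Decimal(string: price, locale: Locale(identifier: "en_US_POSIX")) else {
        return nil
    }
    return CapturedPrice(amount: amount, currency: code)
}
```

Using `Decimal` rather than `Double` keeps values like 19.99 exact, which matters when price changes between captures are compared.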
7. Identifiers
Academic and publishing identifiers for citation.
| Field | Swift Type | Description |
|---|---|---|
| isbn | String? | International Standard Book Number |
| issn | String? | International Standard Serial Number |
| doi | String? | Digital Object Identifier |
| issueNumber | String? | Issue or volume number |
8. Geographic
Location data for events, venues, and place-based content.
| Field | Swift Type | Description |
|---|---|---|
| locationName | String? | Venue or place name |
| streetAddress | String? | Street address |
| city | String? | City name |
| region | String? | State/province/region |
| country | String? | Country name |
| postalCode | String? | ZIP/postal code |
9. Relationships
Series and episodic content positioning.
| Field | Swift Type | Description |
|---|---|---|
| seriesName | String? | Series or collection name |
| seasonNumber | String? | Season number |
| episodeNumber | String? | Episode number |
| volumeNumber | String? | Volume number |
| partOfSeries | String? | Series URL or identifier |
10. Social
Author and publisher social media presence.
| Field | Swift Type | Description |
|---|---|---|
| twitterUsername | String? | Twitter/X handle |
| twitterSiteId | String? | Twitter site ID |
| facebookAppId | String? | Facebook app ID |
| instagramUsername | String? | Instagram handle |
| linkedinProfile | String? | LinkedIn profile URL |
11. Site-Specific
Platform-specific fields for major sites.
YouTube
| Field | Swift Type | Description |
|---|---|---|
| videoId | String? | YouTube video ID |
| channel | String? | Channel name |
| channelUrl | String? | Channel URL |
| duration | String? | Video duration |
| viewCount | String? | View count |
GitHub
| Field | Swift Type | Description |
|---|---|---|
| repository | String? | Repository name |
| owner | String? | Repository owner |
| stars | String? | Star count |
| forks | String? | Fork count |
| language | String? | Primary language |
| topics | String? | Topics, comma-separated |
Twitter/X
| Field | Swift Type | Description |
|---|---|---|
| tweetId | String? | Tweet ID |
| username | String? | Username |
| retweets | String? | Retweet count |
| likes | String? | Like count |
| timestamp | Date? | Tweet timestamp |
Amazon
| Field | Swift Type | Description |
|---|---|---|
| asin | String? | Amazon Standard Identification Number |
| productGroup | String? | Product category |
| features | String? | Key features |
Medium
| Field | Swift Type | Description |
|---|---|---|
| publication | String? | Publication name |
| readTime | String? | Estimated read time |
| clapCount | String? | Clap count |
arXiv
| Field | Swift Type | Description |
|---|---|---|
| arxivId | String? | arXiv identifier |
| categories | String? | Subject categories |
| pdfUrl | String? | PDF download URL |
Product Hunt
| Field | Swift Type | Description |
|---|---|---|
| productId | String? | Product identifier |
| votesCount | String? | Upvote count |
| commentsCount | String? | Comment count |
All fields are optional and are populated only when the source HTML contains the relevant data. Values are extracted from Open Graph, Twitter Cards, Schema.org, Dublin Core, and platform-specific meta tags.
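To illustrate why every field is optional, here is a minimal sketch of reading Open Graph tags from raw HTML. It is not Recall's extractor: the function name `openGraphTags(in:)` is hypothetical, and the regular expression assumes double-quoted attributes with `property` appearing before `content`. Real pages vary, so a proper HTML parser would be the more robust choice.

```swift
import Foundation

/// Hypothetical helper: collects Open Graph properties from raw HTML.
/// Keys are returned without the "og:" prefix. Tags that do not match the
/// assumed attribute order and quoting are simply skipped.
func openGraphTags(in html: String) -> [String: String] {
    let pattern = #"<meta\s+property="og:([^"]+)"\s+content="([^"]*)""#
    guard let regex = try? NSRegularExpression(pattern: pattern,
                                               options: [.caseInsensitive]) else {
        return [:]
    }
    var tags: [String: String] = [:]
    let fullRange = NSRange(html.startIndex..., in: html)
    for match in regex.matches(in: html, options: [], range: fullRange) {
        // Capture group 1 is the property name, group 2 is its content.
        guard let keyRange = Range(match.range(at: 1), in: html),
              let valueRange = Range(match.range(at: 2), in: html) else { continue }
        tags[String(html[keyRange])] = String(html[valueRange])
    }
    return tags
}
```

A lookup such as `openGraphTags(in: html)["title"]` returns nil when the page has no `og:title` tag, which mirrors how a missing tag leaves the corresponding field unpopulated.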
References
- Open Graph Protocol — Facebook’s metadata standard for rich link previews
- Twitter Cards — X/Twitter’s markup for enhanced tweets
- Schema.org — Structured data vocabulary for search engines
- Dublin Core — Metadata standard for digital resources
- JSON-LD — Linked data format often used with Schema.org