php get meta tags

get_meta_tags

(PHP 4, PHP 5, PHP 7, PHP 8)

get_meta_tags — Extracts all meta tag content attributes from a file and returns an array

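Description

get_meta_tags(string $filename, bool $use_include_path = false): array|false

Opens filename and parses it line by line for <meta> tags. Parsing stops at </head>.

Parameters

filename
The path to the HTML file, as a string. This can be a local file or a URL.

Example #1 What get_meta_tags() parses

use_include_path
Setting use_include_path to true makes PHP try to open the file along the standard include path, as set by the include_path directive. This applies to local files, not URLs.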

Return Values

Returns an array with all the parsed meta tags.

The value of the name attribute becomes the key and the value of the content attribute becomes the value in the returned array, so you can use standard array functions to traverse it or access single values. Special characters in the value of the name attribute are replaced with '_', and the rest is converted to lower case. If two meta tags have the same name, only the last one is returned.

Returns false on failure.

Examples

Example #2 What get_meta_tags() returns

// Assuming the above tags are at www.example.com
$tags = get_meta_tags('http://www.example.com/');
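The normalization rules described under Return Values can be tried against a local file. The following is a minimal sketch only: the sample HTML, the temporary file, and the variable names are assumptions made for illustration and are not part of the manual page.

```php
<?php
// Hypothetical sample document; the tag values are made up for illustration.
$html = <<<'HTML'
<html><head>
<meta name="author" content="first author">
<meta name="DESCRIPTION" content="a php manual">
<meta name="geo.position" content="49.33;-86.59">
<meta name="author" content="second author">
</head><body></body></html>
HTML;

// get_meta_tags() takes a filename or URL, so write the sample to a temp file.
$file = tempnam(sys_get_temp_dir(), 'meta');
file_put_contents($file, $html);

$tags = get_meta_tags($file);
unlink($file);

if ($tags === false) {
    exit("Could not read the file\n");
}

// Keys are lowercased, "." becomes "_", and the last duplicate name wins.
foreach ($tags as $name => $content) {
    echo $name, ' => ', $content, "\n";
}
```

Checking the result against false first keeps a failed read from being mistaken for a page without meta tags, which would yield an empty array instead.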

Notes

Only meta tags with name attributes will be parsed. Quotes are not required.

See Also

User Contributed Notes 19 notes

This regex matches meta tags regardless of attribute order by capturing inside lookaheads.
It also uses the branch reset feature to handle the different quote styles of values.
The pattern can be tested here: https://regex101.com/r/oE4oU9/1
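The note's original pattern is not reproduced here, so the following is only a sketch of the same idea written from scratch. The pattern, the sample markup, and the variable names are assumptions. Each lookahead scans the tag for one attribute, and each branch reset group (?|...) gives all of its alternatives the same group number.

```php
<?php
// Sketch pattern (not the one behind the regex101 link): each lookahead scans
// the tag for one attribute, so name and content may appear in either order.
// (?|...) is a branch reset group: every alternative reuses the same group number.
$pattern = <<<'REGEX'
~<meta\b
  (?=[^>]*\bname\s*=\s*(?|"([^"]*)"|'([^']*)'|([^\s"'>]+)))
  (?=[^>]*\bcontent\s*=\s*(?|"([^"]*)"|'([^']*)'|([^\s"'>]+)))
  [^>]*>~ix
REGEX;

// Hypothetical input: reversed attribute order, mixed case, mixed quoting.
$html = '<meta content=\'A short summary\' name="description">'
      . '<META NAME=robots CONTENT="noindex">';

$found = [];
if (preg_match_all($pattern, $html, $matches, PREG_SET_ORDER)) {
    foreach ($matches as $m) {
        // Group 1 is the name, group 2 the content, whatever the quote style.
        $found[strtolower($m[1])] = $m[2];
    }
}

print_r($found);
```

A pattern like this is still fragile: it does not cope with a ">" inside an attribute value, for example, so a DOM parser is usually the safer choice for arbitrary pages.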

If the URL redirects using headers (as you would with the PHP function header("Location: URL");), the page generally has no content. It appears that get_meta_tags() doesn't catch that kind of redirection (as cURL would), and this led to a timeout in my script.

I ran into this in a spider I wrote to feed my database with all available pages on my site. One link pointed to a page that contains only the following code:

<?php
header("Location: sections.php?section=home");
exit();
?>

That made my script hang for a moment, and apparently get_meta_tags() wasn't even able to return an error.

If you want to get the contents of tags other than meta you can use:

I personally ran into fewer issues with the DOM functions than with regular expressions when fetching meta tags without get_meta_tags() (in order to get http-equiv meta tags too).
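The code from the two notes above is not preserved, so here is a minimal sketch of the DOM approach they describe. The function name extractHeadInfo(), the shape of the returned array, and the sample markup are assumptions made for illustration.

```php
<?php
// Sketch: collect <title> and every <meta> tag, keyed by name, http-equiv, or property.
function extractHeadInfo(string $html): array
{
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $info = ['title' => null, 'meta' => []];

    $titles = $doc->getElementsByTagName('title');
    if ($titles->length > 0) {
        $info['title'] = trim($titles->item(0)->textContent);
    }

    foreach ($doc->getElementsByTagName('meta') as $meta) {
        // Use the first identifying attribute that is present on the tag.
        foreach (['name', 'http-equiv', 'property'] as $attr) {
            if ($meta->hasAttribute($attr)) {
                $key = strtolower($meta->getAttribute($attr));
                $info['meta'][$key] = $meta->getAttribute('content');
                break;
            }
        }
    }

    return $info;
}

// Hypothetical sample page.
$html = '<html><head><title>Example page</title>'
      . '<meta http-equiv="refresh" content="5; url=/next">'
      . '<meta name="Description" content="Something.">'
      . '<meta property="og:type" content="article">'
      . '</head><body></body></html>';

print_r(extractHeadInfo($html));
```

Unlike get_meta_tags(), this works on a string that has already been fetched, so the download itself can be done with whatever client and timeout settings suit the script.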

This is a slight amendment to the function by jimmyxx at gmail dot com.

I tried using the regex shown in his code, and PHP threw a couple of errors.

Based on Michael Knapp's code, with some added regex, here's a function that gets all meta tags and the title for a URL. If there's an error, it returns false. Using the function getUrlContents(), also included, it takes care of META REFRESH redirections, following up to the specified number of redirections. Please note that the regular expressions were split into strings because php.net was complaining about the line being too long 😉

// Check if we need to go somewhere else

?>

For the above code the output would be:

Array
(
    [title] => The requested page's title
    [metaTags] => Array
        (
            [description] => Array
                (
                    [html] =>
                    [value] => Something.
                )
        )
    [metaProperties] => Array
        (
            [og:type] => Array
                (
                    [html] => />
                    [value] => article
                )
        )
)
?>

Workaround: if possible move code after header or if not: include a file.

In response to jp at webgraphe dot com:

This function grabs meta tags, not HTTP headers.

If you need the headers:
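The note's own code is not preserved. One way to inspect headers, sketched here under assumptions, is to scan the raw header lines that get_headers() returns for a Location entry. The helper name findRedirectTarget() and the sample header list are hypothetical, and the sample data stands in for a real network response.

```php
<?php
// Sketch: find the redirect target, if any, in a list of raw header lines
// such as the one get_headers($url) returns.
function findRedirectTarget(array $headerLines): ?string
{
    foreach ($headerLines as $line) {
        // Header names are case-insensitive.
        if (is_string($line) && stripos($line, 'Location:') === 0) {
            return trim(substr($line, strlen('Location:')));
        }
    }
    return null;
}

// Sample data standing in for a real response; with network access this
// would be: $headers = get_headers('http://www.example.com/');
$headers = [
    'HTTP/1.1 302 Found',
    'Content-Type: text/html',
    'Location: sections.php?section=home',
];

var_dump(findRedirectTarget($headers));
```

A spider could call such a check before get_meta_tags() and skip or re-queue any URL that answers with a redirect.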

Tim's code is good (thanks Tim), except that it won't work very well if the <title> tag is part of a long non-breaking string.

E.g. try getting the title from Google Maps (http://www.google.com/maps).

A better solution is:

Also, it is probably best to use the /i modifier, because some people might write <TITLE> instead of <title>, etc.
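Neither Tim's code nor the proposed replacement is preserved, so the following is only a sketch of the idea: match against the whole document string rather than line by line, and use the /i modifier. The function name extractTitle() and the sample input are assumptions.

```php
<?php
// Sketch: extract the <title> text with a case-insensitive pattern that
// also works when the tag spans several lines.
function extractTitle(string $html): ?string
{
    // i: match <TITLE> as well as <title>; s: let "." span line breaks;
    // .*? is lazy, so matching stops at the first closing tag.
    if (preg_match('~<title\b[^>]*>(.*?)</title>~is', $html, $m)) {
        return trim(html_entity_decode($m[1], ENT_QUOTES | ENT_HTML5, 'UTF-8'));
    }
    return null;
}

// Hypothetical input with an upper-case tag and surrounding whitespace.
$html = "<html><head><TITLE>\n  Maps &amp; Directions\n</TITLE></head></html>";
echo extractTitle($html), "\n";
```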

/*
** Extracts and formats meta tag content
*/

keywords (13 words | 119 chars)
SELFHTML, HTML, Dynamic HTML, JavaScript, CGI, Perl, Grafik, WWW-Seiten, Web-Seiten, Hilfe, Dokumentation, Beschreibung

Changelog

Version    Description
4.0.5      Added support for unquoted HTML attributes.

I have found that get_meta_tags() is very slow for large searches. I built a large search engine for a website that couldn't use a database, and I first tried pulling out the meta tags with it.
I found it much faster to pull out the meta tags with a regular expression via eregi(). (eregi() was removed in PHP 7.0; preg_match() with the /i modifier is its replacement.) The code below pulls out the description:

get meta description tag with xpath

I need the content of the description and keywords meta tags. I have this code, but it doesn't print anything. Any ideas?

5 Answers

You can reference the attributes using @ followed by the attribute name (see below), and you can query directly for the attributes; your XPath query was almost there.

Instead of including the /html/head part, you could also use a double slash, which means that the following node can be anywhere in the document:

Will give the same result as:

Doesn’t really matter much but it’s less typing.

You have two problems. First, name is an attribute so you need to prepend @,

Second, the nodes are all empty so there is nothing to print.

To print the attribute value, do this,
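The answers' code snippets are not preserved, so here is a minimal sketch of what they describe: selecting the content attribute with @, once with the full /html/head path and once with the shorter // form. The sample page and variable names are assumptions.

```php
<?php
// Hypothetical sample page.
$html = '<html><head>'
      . '<meta name="description" content="A sample description">'
      . '<meta name="keywords" content="php, xpath, meta">'
      . '</head><body></body></html>';

$doc = new DOMDocument();
libxml_use_internal_errors(true);   // tolerate imperfect HTML
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// The absolute path and the shorter "//" form select the same attribute node.
$long  = $xpath->query('/html/head/meta[@name="description"]/@content');
$short = $xpath->query('//meta[@name="description"]/@content');

echo $long->item(0)->nodeValue, "\n";
echo $short->item(0)->nodeValue, "\n";

// string() returns the attribute value directly ('' when nothing matches).
echo $xpath->evaluate('string(//meta[@name="keywords"]/@content)'), "\n";
```

The string() form avoids having to check whether item(0) is null when a page lacks the tag.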

Last but not least, and sorry for reviving this thread, the queries are case sensitive.

In other words, if you look for meta name="description" or meta name="keywords", it will not find meta name="Description" or meta name="Keywords", respectively. So be careful with that!
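A common workaround in XPath 1.0, sketched here with assumed sample data and variable names, is to lower-case the attribute value with translate() before comparing it.

```php
<?php
// Hypothetical page whose meta name uses mixed case.
$html = '<html><head>'
      . '<meta name="Description" content="Mixed-case name">'
      . '</head><body></body></html>';

$doc = new DOMDocument();
libxml_use_internal_errors(true);   // tolerate imperfect HTML
$doc->loadHTML($html);
libxml_clear_errors();
$xpath = new DOMXPath($doc);

// Exact comparison: "Description" does not equal "description".
$exact = $xpath->query('//meta[@name="description"]');

// XPath 1.0 has no lower-case(); translate() maps A-Z to a-z before comparing.
$upper = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
$lower = 'abcdefghijklmnopqrstuvwxyz';
$loose = $xpath->query(
    "//meta[translate(@name, '$upper', '$lower')='description']/@content"
);

echo $exact->length, "\n";
echo $loose->item(0)->nodeValue, "\n";
```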

And after working with the DOM and meta tags for a while, I have come to believe that the best approach is to use this function: http://php.net/manual/es/function.get-meta-tags.php

Make sure that you put EOD; on a line of its own, with no spaces or indentation, like:
