Google cannot access CSS and JS Files

Discuss Search Engine Optimization in relation to Joomla! 3.x. This forum will also have discussions on SEF/SEO Joomla! 3.x extensions.

DaveOzric
Joomla! Ace
Posts: 1591
Joined: Sat May 22, 2010 10:29 pm

Re: Google cannot access CSS and JS Files

Post by DaveOzric » Tue Jul 28, 2015 7:37 pm

I still don't understand the problem with letting them access the modules and plugins folder carte blanche. Anyone?

conlippert
Joomla! Explorer
Posts: 481
Joined: Tue Feb 27, 2007 1:53 pm
Location: Ann Arbor, Michigan

Post by conlippert » Tue Jul 28, 2015 8:05 pm

Just spent some time on Google's forum, and Joomla, Drupal and WordPress sites all over the world are getting this problem! Google needs to see all CSS and JS files to index the site properly? I find this really disturbing and yes, all my Joomla sites are popping up with this problem. Grrrr....vent.... My messages almost all refer to valid third-party component pieces I use, like JCE! Seems like Google could have consulted with the development teams before changing the algorithm and causing all this upheaval.
Google thread here: https://productforums.google.com/forum/ ... v35X4paAcJ

duanemitchell
Joomla! Enthusiast
Posts: 100
Joined: Wed Jan 03, 2007 3:16 am
Location: Boston, MA USA

Post by duanemitchell » Tue Jul 28, 2015 8:59 pm

Does it matter if an Allow follows a Disallow with the same parent directory, like this:

Code:

Disallow: /libraries/
Allow: /libraries/gantry/css/grid-responsive.css
Allow: /libraries/gantry/js/browser-engines.js

Or can all the Allows be grouped together? That would allow cutting and pasting among the robots.txt files, which would be a time saver.
Next Generation Solutions | http://www.nxgnsol.com
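
A note on the ordering question: crawlers that follow Google's documented precedence (later standardized in RFC 9309) use the matching rule with the longest path, so the position of Allow and Disallow lines within a group should not matter and the Allow lines can be grouped together. Older or simpler parsers may instead stop at the first matching line, so treat this as a sketch of one behavior rather than a guarantee for every crawler. The helper name `is_allowed` and the tuple representation are assumptions made for illustration, and wildcards are deliberately not handled.

```python
def is_allowed(rules, path):
    """Return True if `path` may be crawled under longest-match precedence.

    `rules` is a list of (directive, prefix) tuples in file order.
    The matching rule with the longest prefix wins; on a tie, Allow wins.
    A path that matches no rule is allowed. Wildcards are NOT supported.
    """
    best_len, best_allow = -1, True
    for directive, prefix in rules:
        if prefix and path.startswith(prefix):
            allow = directive.lower() == "allow"
            if len(prefix) > best_len or (len(prefix) == best_len and allow):
                best_len, best_allow = len(prefix), allow
    return best_allow


# The rules from the post, in the order they were written.
rules_mixed = [
    ("Disallow", "/libraries/"),
    ("Allow", "/libraries/gantry/css/grid-responsive.css"),
    ("Allow", "/libraries/gantry/js/browser-engines.js"),
]

# The same rules with every Allow grouped first.
rules_grouped = [
    ("Allow", "/libraries/gantry/css/grid-responsive.css"),
    ("Allow", "/libraries/gantry/js/browser-engines.js"),
    ("Disallow", "/libraries/"),
]

for path in (
    "/libraries/gantry/css/grid-responsive.css",
    "/libraries/joomla/factory.php",  # hypothetical path, for illustration only
    "/index.php",
):
    # Both orderings give the same answer under longest-match precedence.
    print(path, is_allowed(rules_mixed, path), is_allowed(rules_grouped, path))
```

Under this rule both orderings give identical results, because the longer Allow paths outrank the shorter `/libraries/` Disallow wherever they appear in the group.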

Post by conlippert » Tue Jul 28, 2015 9:10 pm

This was suggested and worked for one of my sites. Add it to the bottom of the robots.txt file.

Code:

User-Agent: Googlebot
Allow: /*.js*
Allow: /*.css*
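
A caveat worth checking before relying on this: as I understand Google's documented behavior, a crawler obeys only the one group whose User-agent line matches it most specifically, and does not merge that group with the `*` group. If so, adding a Googlebot group like the one above means Googlebot no longer sees the Disallow lines at all, which would explain the complete fetch. The sketch below illustrates that selection. The sample file is a shortened stand-in for a CMS robots.txt, and `parse_groups`, `rules_for`, and `ExampleBot` are hypothetical names, not part of any library.

```python
ROBOTS_TXT = """\
User-agent: *
Disallow: /administrator/
Disallow: /libraries/
Disallow: /plugins/

User-agent: Googlebot
Allow: /*.js*
Allow: /*.css*
"""


def parse_groups(text):
    """Map each lowercased User-agent token to its (directive, path) rules."""
    groups = {}
    agents = []        # agents named by the group currently being read
    in_rules = False   # True once the current group has at least one rule
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:  # a User-agent line after rules starts a new group
                agents, in_rules = [], False
            agents.append(value.lower())
            groups.setdefault(value.lower(), [])
        elif field in ("allow", "disallow"):
            in_rules = True
            for agent in agents:
                groups[agent].append((field, value))
    return groups


def rules_for(groups, crawler):
    """Return the rules of the single most specific group for `crawler`.

    Groups are not merged: if a named group matches, the '*' group is ignored.
    """
    name = crawler.lower()
    named = [a for a in groups if a and a != "*" and a in name]
    if named:
        return groups[max(named, key=len)]
    return groups.get("*", [])


groups = parse_groups(ROBOTS_TXT)
googlebot_rules = rules_for(groups, "Googlebot")
other_rules = rules_for(groups, "ExampleBot")
print("Googlebot:", googlebot_rules)
print("ExampleBot:", other_rules)
```

In this model Googlebot ends up with only the two Allow lines, while any other crawler still gets the Disallow lines from the `*` group.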

Post by DaveOzric » Tue Jul 28, 2015 9:31 pm

conlippert wrote:This was suggested and worked for one of my sites. Add it to the bottom of the robots.txt file.

Code:

User-Agent: Googlebot
Allow: /*.js*
Allow: /*.css*
Yes, this is the stuff we need. Some way to let Google look but don't tell. Secret, wink wink kind of thing.

Post by DaveOzric » Tue Jul 28, 2015 9:37 pm

I just added this and voilà, no partial results; instead a big fat complete with a shiny green check mark.

@conlippert thank you

Post by conlippert » Tue Jul 28, 2015 9:41 pm

Glad it worked...now on to my other >29 sites all of which Google is not paying me to change of course! Vent vent grrr!

Post by DaveOzric » Tue Jul 28, 2015 9:52 pm

I hear that

Post by deleted user » Tue Jul 28, 2015 10:20 pm

DaveOzric wrote:I still don't understand the problem with letting them access the modules and plugins folder carte blanche. Anyone?
Same reason you wouldn't want Google crawling the components or libraries folders I would say. These folders are mostly PHP files generating the content that gets rendered to your web browser. Now granted, unless someone explicitly posts a link to www.example.com/plugins/user/joomla/joomla.php Google shouldn't pick that up as a valid link, but because the entirety of Joomla's filesystem is in the publicly accessible web root, this is one measure to make sure Google isn't trying to follow links to the PHP files in your site.

Post by DaveOzric » Tue Jul 28, 2015 11:04 pm

That makes sense. What doesn't is the lack of a directive that allows them access but no indexing for searches. Seems like that would solve many issues like this.

Post by deleted user » Tue Jul 28, 2015 11:19 pm

That's a standard that would have to change far beyond Joomla; robots.txt is an Internet standard. Though it'd definitely be nice to see it improve in general.

Post by DaveOzric » Tue Jul 28, 2015 11:24 pm

Google may drive the change if anyone will. Perhaps this is evolution.

Bruno Kopf
Joomla! Fledgling
Posts: 1
Joined: Wed Jul 29, 2015 8:32 am

Post by Bruno Kopf » Wed Jul 29, 2015 8:42 am

@conlippert Thanks that solved it :)

I am however still getting one js blocked that seems to not be part of my site and causing an error with googlebot. My site is www.firedesire.co.za and the error url is http://static.doubleclick.net/instream/ad_status.js

Can anyone tell me what this is and how I can remove it? It's not coming up on my other sites, only this one.

Post by DaveOzric » Wed Jul 29, 2015 11:42 am

I forgot about that. I am asking this on the Google forum.

lillianfidler
Joomla! Explorer
Posts: 414
Joined: Mon Mar 31, 2008 8:28 pm
Location: St. John's, Newfoundland, Canada

Post by lillianfidler » Wed Jul 29, 2015 12:52 pm

Hi, thanks for that - can I ask a potentially silly question? How do we know which components to allow, and also cache files?

NHRADeuce
Joomla! Enthusiast
Posts: 106
Joined: Sat Aug 26, 2006 8:26 pm
Location: Huntersville, NC

Post by NHRADeuce » Wed Jul 29, 2015 12:56 pm

Bruno Kopf wrote:I am however still getting one js blocked that seems to not be part of my site and causing an error with googlebot. My site is http://www.firedesire.co.za and the error url is http://static.doubleclick.net/instream/ad_status.js

Can anyone tell me what this is and how I can remove it? It's not coming up on my other sites, only this one.
That file is not being served from your domain so not only can you not control it, Google will not penalize you. You will see that you are still getting the complete message instead of the dreaded partial. This particular file has something to do with a banner you are displaying on your site.
DaveOzric wrote:I still don't understand the problem with letting them access the modules and plugins folder carte blanche. Anyone?
There is absolutely no security issue with doing this. None. Robots.txt is not a security protocol.
conlippert wrote:This was suggested and worked for one of my sites. Add it to the bottom of the robots.txt file.

Code:

User-Agent: Googlebot
Allow: /*.js*
Allow: /*.css*
This works great. For Google. I'd bet that this will be causing issues for Bing and likely other crawlers in the near future if it doesn't already. If you are upset that you had to spend a lot of time fixing this for Google, you're going to be really upset when you have to go back and do it again for Bing. And then every other legitimate crawler in your locale that follows suit. Crawler-specific fixes are a bad idea for that reason.
mbabker wrote:
DaveOzric wrote:I still don't understand the problem with letting them access the modules and plugins folder carte blanche. Anyone?
Same reason you wouldn't want Google crawling the components or libraries folders I would say. These folders are mostly PHP files generating the content that gets rendered to your web browser. Now granted, unless someone explicitly posts a link to http://www.example.com/plugins/user/joomla/joomla.php Google shouldn't pick that up as a valid link, but because the entirety of Joomla's filesystem is in the publicly accessible web root, this is one measure to make sure Google isn't trying to follow links to the PHP files in your site.
The only reason to block anything on a Joomla site is to speed up the crawler and ease server load. Even if you made a sitemap of every single file in a Joomla install and submitted it to Google, none of those files will ever get indexed. The first line of executable code in a Joomla file should always be:

Code:

defined('_JEXEC') or die;
That means every file that is accessed directly will display a blank page, which is ignored by Google (and other legitimate crawlers). Even if all of those files were to be indexed it wouldn't matter because all of the links would lead to blank pages. Again, this is not a security issue, this is a what-can-Google-crawl issue.

The point of robots.txt is to keep publicly accessible files from being indexed. If the file cannot be directly accessed in a browser by a public, anonymous user, there is no need to block that file. Examples would be publicly accessible images you don't want in Google Images, or your 404 page, your terms and conditions page, or other pages that offer no SEO value in being indexed.
Brent Friar
BNR Branding Solutions Website Development
bnrbranding.com

Post by deleted user » Wed Jul 29, 2015 1:16 pm

The entire Joomla filesystem is publicly accessible. Therefore, Joomla ships a robots.txt file that suggests that its actual filesystem (minus certain resources which should generally be publicly accessible) not be indexed. And contrary to popular belief, not every developer includes the defined or die check, plus this check is not edited into third party PHP resources, so yes it is an extra "requirement" to keep in mind.

Post by DaveOzric » Wed Jul 29, 2015 1:19 pm

You are correct, the other file not on my server disappeared from the list once I allowed the rest.

I understand there are no security issues with any of this. I am not convinced Google doesn't index all of your site regardless of the robots.txt file entries. They just want to know what you want indexed.

As for the other search engines, can't you just have a global robots directive for the js and css? Not just User-Agent: Googlebot

Post by conlippert » Wed Jul 29, 2015 1:41 pm

Seems reasonable to me to allow access to all css & js files and skip the Googlebot directive. Does anyone know why Google wants access to css & js files as part of its verification? Why not php and xml files also?

Post by NHRADeuce » Wed Jul 29, 2015 2:21 pm

mbabker wrote:The entire Joomla filesystem is publicly accessible. Therefore, Joomla ships a robots.txt file that suggests that its actual filesystem (minus certain resources which should generally be publicly accessible) not be indexed. And contrary to popular belief, not every developer includes the defined or die check, plus this check is not edited into third party PHP resources, so yes it is an extra "requirement" to keep in mind.
Good point, not all devs use that check, just like many don't confine their js/css files to the recommended locations. In any case, even files without the check wouldn't be indexed, since they result in an error page if accessed directly.
DaveOzric wrote:You are correct, the other file not on my server disappeared from the list once I allowed the rest.

I understand there are no security issues with any of this. I am not convinced Google doesn't index all of your site regardless of the robots.txt file entries. They just want to know what you want indexed.

As for the other search engines, can't you just have a global robots directive for the js and css? Not just User-Agent: Googlebot
Google shouldn't be crawling any files that are not linked to from other files, but it's certainly possible that they touch files that are not linked to directly. There's no harm in removing the user agent and making it a global directive for all js/css files.
conlippert wrote:Seems reasonable to me to allow access to all css & js files and skip the Googlebot directive. Does anyone know why Google wants access to css & js files as part of its verification? Why not php and xml files also?
There are a couple of reasons for this. First, Google very specifically states that the page a user sees should be the same thing that Googlebot sees. This is mostly to prevent some blackhat SEO techniques that serve crawler-specific pages that differ from the ones a user sees. The other relates to the mobile-friendliness of a site. If all of the css/js is not loaded, then Google cannot accurately assess whether the site is mobile friendly or not. Blocking access to those files could result in a site losing its mobile-friendly designation.

As far as other file types, Google only needs access to the files being used to render an HTML page. Since all of those files are called from within the Joomla framework and used to render the final HTML, Google does not need direct access to those files. Google is only interested in the HTML page being crawled and the files called within that page to render it properly.

Ignore the comment below; it is not factually correct. Robots.txt does indeed block access to files for legitimate crawlers and will reduce server load.
NHRADeuce wrote:The only reason to block anything on a Joomla site is to speed up the crawler and ease server load.
After giving this some thought, this may not even be the case. Robots.txt prevents a file from being indexed, it does not prevent it from being crawled. So really the only files that would need to be blocked would be publicly accessible files you don't want indexed.
Last edited by NHRADeuce on Wed Jul 29, 2015 2:52 pm, edited 1 time in total.

Post by DaveOzric » Wed Jul 29, 2015 2:35 pm

NHRADeuce wrote:After giving this some thought, this may not even be the case. Robots.txt prevents a file from being indexed, it does not prevent it from being crawled. So really the only files that would need to be blocked would be publicly accessible files you don't want indexed.
This confuses me. Does Google want to index these files or just crawl them? I was told they won't present any js or css files in search results, so indexing them seems off. So crawling them is not enough. They need to index them to see the contents.

This whole thing seems really ridiculous to me. Google is getting like a senile person. All we are asking is that they don't give out these files in any results. Like privacy, not access. What is so freaking difficult about that? If this is going to change, then Google should implement a new robots standard called "look at whatever the heck you want, just don't tell anyone". WTF

This is reminiscent of my issues with ghost referral on my analytics. Why is that happening? I have to jump through all these hoops to create filters and this isn't even real traffic. Why are they making everything such a pain in the...?

Post by NHRADeuce » Wed Jul 29, 2015 2:47 pm

DaveOzric wrote:This confuses me. Does Google want to index these files or just crawl them? I was told they won't present any js or css files in search results, so indexing them seems off. So crawling them is not enough. They need to index them to see the contents.

This whole thing seems really ridiculous to me. Google is getting like a senile person. All we are asking is that they don't give out these files in any results. Like privacy, not access. What is so freaking difficult about that? If this is going to change, then Google should implement a new robots standard called "look at whatever the heck you want, just don't tell anyone". WTF

This is reminiscent of my issues with ghost referral on my analytics. Why is that happening? I have to jump through all these hoops to create filters and this isn't even real traffic. Why are they making everything such a pain in the...?
I'm going to get this right one of these times.

Robots.txt does prevent a legitimate crawler from accessing the page at all. I confused a problem I had seen in the past with some crawlers not respecting robots.txt. So robots.txt will definitely reduce server load by keeping legitimate crawlers out of files that do not need to be indexed.

Relevant text from the robots.txt standard:
Disallow

The value of this field specifies a partial URL that is not to be visited.
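
The prefix behavior described in that passage can be tried out with Python's standard-library `urllib.robotparser`, which, as far as I know, treats each path as a plain prefix and does not interpret `*` wildcards inside paths, so it is not a faithful model of Googlebot. A minimal sketch, with `ExampleBot`, `build_parser`, and the example.com URLs as made-up placeholders:

```python
from urllib.robotparser import RobotFileParser

# A two-line robots.txt using the Disallow rule discussed in this thread.
ROBOTS_LINES = [
    "User-agent: *",
    "Disallow: /libraries/",
]


def build_parser(lines):
    """Create a parser from in-memory robots.txt lines (no network access)."""
    parser = RobotFileParser()
    parser.parse(lines)
    return parser


parser = build_parser(ROBOTS_LINES)

# Any URL whose path starts with /libraries/ is off limits to a compliant bot.
blocked = parser.can_fetch(
    "ExampleBot", "http://www.example.com/libraries/gantry/js/browser-engines.js"
)

# A path outside the disallowed prefix may still be visited.
reachable = parser.can_fetch(
    "ExampleBot", "http://www.example.com/templates/site/css/template.css"
)

print(blocked, reachable)
```

This should print `False True`: the Disallow value acts as a partial URL, so everything beneath `/libraries/` is skipped by a crawler that honors the file, while nothing stops a crawler that chooses to ignore it.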

Post by DaveOzric » Wed Jul 29, 2015 3:05 pm

So the allow entry gives them full disclosure to do what they want with this file? All we ask is to have an entry that tells them to look but don't show anyone else. Is that so hard to implement?

Any unscrupulous spider or bot can access this and will ignore any directive. I look at my server logs and see hundreds of thousands of hits from Russia, Korea, China on my sites every month. So this whole thing is just plain stupid in my mind.

Again, the Google forum response was that Google will NOT give out css and js in any search results. If we could verify this then adding the allow js and css for the top search engines should be fine.

A note: I had Bing bring down my server a few times by indexing it thousands of times a day. I verified it was them, and it was so annoying I was losing it. Obviously a broken algorithm on their end. So reducing server load is key.

Post by deleted user » Wed Jul 29, 2015 3:15 pm

If Google can't crawl a file, it can't index it. If they can crawl it, and it meets their criteria for indexing, then it will be indexed. That's my interpretation of everything as probably the most non-SEO-informed person in this thread.

Post by NHRADeuce » Wed Jul 29, 2015 4:30 pm

DaveOzric wrote:So the allow entry gives them full disclosure to do what they want with this file? All we ask is to have an entry that tells them to look but don't show anyone else. Is that so hard to implement?
mbabker wrote:If Google can't crawl a file, it can't index it. If they can crawl it, and it meets their criteria for indexing, then it will be indexed. That's my interpretation of everything as probably the most non-SEO-informed person in this thread.
This. Google has to crawl it to determine if it should be indexed. In general, they won't index js/css/cgi etc.
DaveOzric wrote:Any unscrupulous spider or bot can access this and will ignore any directive. I look at my server logs and see hundreds of thousands of hits from Russia, Korea, China on my sites every month. So this whole thing is just plain stupid in my mind.

Again, the Google forum response was that Google will NOT give out css and js in any search results. If we could verify this then adding the allow js and css for the top search engines should be fine.

A note: I had Bing bring down my server a few times by indexing it thousands of times a day. I verified it was them, and it was so annoying I was losing it. Obviously a broken algorithm on their end. So reducing server load is key.
Correct. The robots.txt protocol is a guide for the good guys and carries absolutely no weight with the actual accessing of files on a server. Many crawlers completely ignore robots.txt because the server itself does not use the file to determine what can and cannot be accessed.

As far as Bing is concerned, if you haven't already done so, you need a Bing WMT account. They allow you to control crawl frequency and scheduling so your server doesn't get hammered at peak times.

SimonHayter
Joomla! Guru
Posts: 530
Joined: Tue Nov 29, 2011 2:43 pm
Location: Bournemouth

Post by SimonHayter » Wed Jul 29, 2015 6:10 pm

I have noticed this message occurring a lot lately and the problem doesn't appear to be a Joomla isolated case, it seems that Googlebot for some odd reason is reporting issues when they DO NOT actually exist.

I'd only recommend you go digging through robots and your htaccess if you know an issue actually exists; don't attempt to fix one before you know it's one...

Post by NHRADeuce » Wed Jul 29, 2015 11:14 pm

SimonHayter wrote:I have noticed this message occurring a lot lately and the problem doesn't appear to be a Joomla isolated case, it seems that Googlebot for some odd reason is reporting issues when they DO NOT actually exist.

I'd only recommend you go digging through robots and your htaccess if you know an issue actually exists; don't attempt to fix one before you know it's one...
This is definitely not a Joomla isolated issue; it affects most CMSes, since they basically all come with a robots.txt that limits access to the inner workings of the CMS. This also affects WordPress and Drupal for sure.

It most certainly is an issue that exists, you can easily verify by following the instructions in the email GWMT is sending out. It's a simple fix to allow access to all js and css files throughout the file system.

AndresNWD
Joomla! Enthusiast
Posts: 176
Joined: Thu Sep 06, 2012 1:45 pm
Location: Granada-Spain

Post by AndresNWD » Thu Jul 30, 2015 3:26 pm

I've got several friends who work as WordPress developers and they've received some false negatives too. Some from websites with a robots.txt with only one line:
Disallow: /wp-admin/
I work at
http://www.component-creator.com - Easy Joomla MVC development
http://www.neno-translate.com - The complete translation solution for Joomla

Post by NHRADeuce » Thu Jul 30, 2015 8:32 pm

AndresNWD wrote:wordpress developers
Isn't that an oxymoron?

Sorry! Couldn't resist!! :D

Post by DaveOzric » Fri Jul 31, 2015 1:17 pm

This is pretty stupid. I just tested a site they sent me the email notice for and got a complete in all versions on the fetch.

So which is it, blocked or not? Google seems to be slipping these days. Ghost traffic anyone?


Locked