- How To Get In To Google News – My Moz Whiteboard Friday
- Technical SEO in the Real World
In September 2018 I gave a talk at the awesome Learn Inbound conference in Dublin, where I was privileged to be part of a speaker lineup that included Britney Muller, Wil Reynolds, Ian Lurie, Aleyda Solis, Paddy Moogan, Laura Crimmons, Jon Myers, and many more excellent speakers. My talk was about some of the more interesting technical SEO conundrums I’ve encountered over the years. The folks at Learn Inbound recorded the talk and have made it available for viewing: I gave an updated version of this talk two weeks later at BrightonSEO, so if you missed either one of those you can now watch it back for yourself.
- Google AMP Can Go To Hell
Let’s talk about Accelerated Mobile Pages, or AMP for short. AMP is a Google pet project that purports to be “an open-source initiative aiming to make the web better for all”. While there is a lot of emphasis on the official AMP site about its open source nature, the fact is that over 90% of contributions to this project come from Google employees, and it was initiated by Google. So let’s be real: AMP is a Google project.

Google is also the reason AMP sees any kind of adoption at all. Basically, Google has forced websites – specifically news publishers – to create AMP versions of their articles. For publishers, AMP is not optional; without AMP, a publisher’s articles will be extremely unlikely to appear in the Top Stories carousel on mobile search in Google. And due to the popularity of mobile search compared to desktop search, visibility in Google’s mobile search results is a must for publishers that want to survive in this era of diminishing revenue and fierce online competition for eyeballs.

If publishers had a choice, they’d ignore AMP entirely. It already takes a lot of resources to keep a news site running smoothly and performing well. AMP adds the extra burden of creating separate AMP versions of articles, and keeping these articles compliant with the ever-evolving standard. So AMP is being kept alive artificially. AMP survives not because of its merits as a project, but because Google forces websites to either adopt AMP or forego large amounts of potential traffic.

And Google is not satisfied with that. No, Google wants more from AMP. A lot more.

Search Console Messages

Yesterday some of my publishing clients received these messages from Google Search Console: Take a good look at those messages. A very good look.
These are the issues that Google sees with the AMP versions of these websites:

- “The AMP page is missing all navigational features present in the canonical page, such as a table of contents and/or hamburger menu.”
- “The canonical page allows users to view and add comments, but the AMP article does not. This is often considered missing content by users.”
- “The canonical URL allows users to share content directly to diverse social media platforms. This feature is missing on the AMP page.”
- “The canonical page contains a media carousel that is missing or broken in the AMP version of the page.”

Basically, any difference between the AMP version and the regular version of a page is seen as a problem that needs to be fixed. Google wants the AMP version to be 100% identical to the canonical version of the page. Yet due to the restrictive nature of AMP, putting these features into an article’s AMP version is not easy. It requires a lot of development resources to make this happen and appease Google. It basically means developers have to do all the work they already put into building the normal version of the site all over again, specifically for the AMP version.

Canonical AMP

The underlying message is clear: Google wants full equivalency between the AMP and canonical URLs. Every element that is present on a website’s regular version should also be present on its AMP version: every navigation item, every social media sharing button, every comment box, every image gallery. Google wants publishers’ AMP versions to look, feel, and behave exactly like the regular version of the website.

What is the easiest, most cost-efficient, least problematic method of doing this? Yes, you guessed it – just build your entire site in AMP. Rather than create two separate versions of your site, why not build the whole site in AMP and drastically reduce the cost of keeping your site up and running?
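To illustrate the two-version setup being discussed (a sketch using a made-up example.com URL, not markup from any of the sites mentioned): a paired AMP implementation ties the two versions of an article together with rel annotations in the head of each page:

```html
<!-- On the regular (canonical) article page: -->
<link rel="amphtml" href="https://example.com/article/amp/">

<!-- On the AMP version of the same article: -->
<link rel="canonical" href="https://example.com/article/">
```

These two links are the easy part. Everything else – navigation, comments, sharing buttons, media carousels – has to be rebuilt separately within AMP’s restrictions, which is exactly the duplicated effort that makes maintaining paired AMP pages so expensive.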
Google doesn’t quite come out and say this explicitly, but they’ve been hinting at it for quite a while. It was part of the discussion at AMP Conf 2018 in Amsterdam, and these latest Search Console messages are not-so-subtle hints at publishers: fully embracing AMP as the default front-end codebase for their websites is the path of least resistance.

That’s what Google wants. They want websites to become fully AMP, every page AMP compliant and adhering to the limitations of the AMP standard.

The Google-Shaped Web

The web is a messy, complicated place. Since the web’s inception developers have played fast and loose with official standards, and web browsers like Netscape and Internet Explorer added to this mess by introducing their own unofficial technologies to help advance the web’s capabilities. The end result is an enormously diverse and anarchic free-for-all where almost no two websites use the same code. It’s extremely rare to find websites that look good, have great functionality, and are fully W3C compliant.

For a search engine like Google, whose entire premise is based on understanding what people have published on the web, this is a huge challenge. Google’s crawlers and indexers have to be very forgiving and process a lot of junk to be able to find and index content on the web. And as the web continues to evolve and becomes more complex, Google struggles more and more with this.

For years Google has been nudging webmasters to create better websites – ‘better’ meaning ‘easier for Google to understand’. Technologies like XML sitemaps and schema.org structured data are strongly supported by Google because they make the search engine’s life easier. Other initiatives like disavow files and rel=nofollow help Google keep its link graph clean and free from egregious spam. All the articles published on Google’s developer website are intended to ensure the chaotic, messy web becomes more like a clean, easy-to-understand web.

In other words, a Google-shaped web.
This is a battle Google has been fighting for decades. And the latest weapon in Google’s arsenal is AMP. Websites built entirely in AMP are a total wet dream for Google. AMP pages are fast to load (so fast to crawl), easy to understand (thanks to mandatory structured data), and devoid of any unwanted clutter or mess (as that breaks the standard).

An AMPified web makes Google’s life so much easier. They would no longer struggle to crawl and index websites, would require significantly less effort to extract meaningful content from webpages, and would be able to rank the best possible pages in any given search result.

Moreover, AMP allows Google to basically take over hosting the web as well. The Google AMP Cache will serve AMP pages instead of a website’s own hosting environment, and also allow Google to perform their own optimisations to further enhance user experience. As a side benefit, it also allows Google full control over content monetisation. No more rogue ad networks, no more malicious ads, all monetisation approved and regulated by Google. If anything happens that falls outside of the AMP standard’s restrictions, the page in question simply becomes AMP-invalid and is ejected from the AMP cache – and subsequently from Google’s results. At that point the page might as well not exist any more.

Neat. Tidy. Homogenous. Google-shaped.

Dance, Dance for Google

Is this what we want? Should we just succumb to Google’s desires and embrace AMP, hand over control of our websites and content to Google? Yes, we’d be beholden to what Google deems acceptable and publishable, but at least we’d get to share in the spoils. Google makes so much money, plenty of companies would be happy feeding off the crumbs that fall from Google’s richly laden table.

It would be easy, wouldn’t it? Just do what Google tells you to. Stop struggling with tough decisions, just let go of the reins and dance to Google’s fiddle. Dance, dance like your company’s life depends on it.
Because it does.

You know what I say to that? No. Google can go to hell.

Who are they to decide how the web should work? They didn’t invent it, they didn’t popularise it – they got filthy rich off of it, and think that gives them the right to tell the web what to do. “Don’t wear that dress,” Google is saying, “it makes you look cheap. Wear this instead, nice and prim and tidy.” F#&! you Google, and f#&! the AMP horse you rode in on.

This is the World Wide Web – not the Google Wide Web. We will do as we damn well please. It’s not our job to please Google and make our websites nice for them. No, they’ve got this the wrong way round – it’s their job to make sense of our websites, because without us Google wouldn’t exist.

Google has built their entire empire on the backs of other people’s effort. People use Google to find content on the web. Google is just a doorman, not the destination. Yet the search engine has epic delusions of grandeur and has started to believe they are the destination, that they are the gatekeepers of the web, that they should dictate how the web evolves. Take your dirty paws off our web, Google. It’s not your plaything, it belongs to everyone.

Fight Back

Some of my clients will ask me what to do with those messages. I will tell them to delete them. Ignore Google’s nudging, pay no heed.

Google is going to keep pushing. I expect those messages to turn into warnings, and eventually become full-fledged errors that render pages invalid under the AMP standard. Google wants a cleaner, tidier, less diverse web, and they will use every weapon at their disposal to accomplish that. Canonical AMP is just one of those weapons, and they have plenty more. Their partnership with the web’s most popular CMS, for example, is worth keeping an eye on.

The easy thing to do is to simply obey. Do what Google says. Accept their proclamations and jump when they tell you to.

Or you could fight back. You could tell them to stuff it, and find ways to undermine their dominance.
Use a different search engine, and convince your friends and family to do the same. Write to your elected officials and ask them to investigate Google’s monopoly. Stop using the Chrome browser. Ditch your Android phone. Turn off Google’s tracking of your every move. And, for goodness sake, disable AMP on your website. Don’t feed the monster – fight it.
- Google News vs Donald Trump: Bias in Google’s Algorithms?
This morning US president Donald Trump sent out a few tweets about Google News. Since optimising news publishers for Google News is one of my key specialities as a provider of SEO services, this piqued my interest more than a little.

In his tweets, Trump accuses Google News of having a liberal anti-Trump bias: “96% of results on “Trump News” are from National Left-Wing Media, very dangerous. Google & others are suppressing voices of Conservatives and hiding information and news that is good. They are controlling what we can & cannot see. This is a very serious situation-will be addressed!”

The source of Trump’s information regarding Google News’s perceived bias is the right-wing blog PJ Media, who published a story about the sites that Google News lists when searching for ‘Trump’ in Google and selecting the ‘News’ tab in search results. According to PJ Media, “Not a single right-leaning site appeared on the first page of search results.” This is the chart that PJ Media used to determine if a listed news site is right-wing or left-wing:

Putting aside the questionable accuracy of this chart and the tiny sample size of PJ Media’s research, there is a valid underlying question: can algorithms be truly neutral?

Google News vs Regular Google Search

First of all we need to be clear about what we mean when we say ‘Google News’. Google’s search ecosystem is vast, complex, and intricately linked. Originally, Google News was a separate search vertical that allowed people to search for news stories. It was soft-launched in beta in 2002 and officially launched in 2006.

Then, in 2007, came Universal Search. Google started combining results from different verticals – images, videos, news, shopping – with its regular web search results. This was the start of Google’s SERPs as we know them today: rich results pages where pages from the web are combined with relevant news stories, images, and knowledge graph information. This is still the norm today.
Take, for example, Google’s regular web search result for ‘trump’: In just this one search result we have a knowledge panel on the right with information on Trump, related movies & TV shows, Trump’s official social media profiles, and a ‘People also search for’ box. In the main results area we have a Top Stories carousel followed by recent tweets from Donald Trump, relevant videos, a ‘People also ask’ box of related searches, a box with other US presidents, another box with political leaders, and a box with people relevant to Trump’s wife Ivana. And amidst all this there are nine ‘regular’ web search results.

While Trump’s official website is listed, it’s not the first regular result and the page is dominated by results from publishers: The Guardian, BBC, The Independent, Washington Post, The Atlantic, Vanity Fair, and NY Magazine. There’s a reason publishers tend to dominate such search results – I’ve given conference talks about that topic – but believe it or not, that’s not where news websites get the majority of their Google traffic from. Nor is the news.google.com vertical a particularly large source of traffic: it only accounts for approximately 3% of traffic to news sites.

So where does publishers’ search traffic come from? Well, news publishers depend almost entirely on the Top Stories carousel for their search traffic. Especially on mobile devices (which is where the majority of Google searches happen) the Top Stories carousel is a very dominant feature of the results page. According to research from Searchmetrics, this Top Stories box appears in approximately 11.5% of all Google searches, which amounts to billions of search results pages every single day.

This is why news publishers work so very hard to appear in that Top Stories carousel, even when it means implementing technologies like AMP, which are contrary to most news organisations’ core principles but are a requirement for appearing in Top Stories on mobile.
Of course, search is not the only source of traffic for news publishers, but it is by far the largest. News publishers don’t really have much of a choice: they either play by Google’s rules to try and claim visibility in Google News, or try and survive on the scraps that fall from Google’s table.

For me the interesting question is not ‘is Google News biased?’ but ‘how does Google select Top Stories?’ The answer to that question has three main elements: technology, relevancy, and authority.

The Technology of Google News & Top Stories

The technical aspects of ranking in the Top Stories carousel are fairly straightforward, but by no means simple.

First of all, the news site has to be included in the Google News index. This is not optional – according to NewsDashboard, over 99% of articles shown in Top Stories are from websites that are included in the Google News index. Because this news index is manually maintained, there is an immediate opportunity for accusations of bias. The people responsible for curating the Google News index make decisions about which websites are okay and which aren’t, and this cannot be a ‘neutral’ and ‘objective’ process because people aren’t neutral and objective. Every news site is accepted or rejected on the basis of a human decision. As all human decisions are subject to bias – especially unconscious bias – this makes the initial approval process already a subjective one.

Secondly, the news site needs to have certain technical elements in place to allow Google News to quickly crawl and index new articles. This includes structured data markup for your articles, and a means of letting Google know you have new articles (usually through a news-specific XML sitemap). Both of these technologies are heavily influenced by Google: schema.org is a joint project from Google, Bing and Yahoo, and the sitemaps protocol is entirely dependent on search engines like Google for its existence.
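As an illustration of that second requirement (a minimal sketch using the standard Google News sitemap namespace; the URL, publication name, and dates are made up for the example): a news-specific XML sitemap announces fresh articles like this:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:news="http://www.google.com/schemas/sitemap-news/0.9">
  <url>
    <loc>https://example.com/news/some-article/</loc>
    <news:news>
      <news:publication>
        <news:name>Example News</news:name>
        <news:language>en</news:language>
      </news:publication>
      <news:publication_date>2018-08-28T09:00:00+01:00</news:publication_date>
      <news:title>Some Article Headline</news:title>
    </news:news>
  </url>
</urlset>
```

Unlike a regular XML sitemap, Google’s guidance is for a news sitemap to only list recent articles (roughly the last two days’ worth), so it stays small and can be crawled very frequently.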
Thirdly, you need to have valid AMP versions of your articles. Some may see this as an optional aspect, but really, without AMP a news site will not appear in Top Stories on mobile search results. This presents such a catastrophic loss of potential search traffic that it’s economically unfeasible for news websites to forego AMP. While AMP is presented as an open source project, in reality the vast majority of its code is written by Google engineers. At last count, over 90% of the AMP code comes from Googlers. So let’s be honest, AMP is a Google project.

This gives Google full technical control over Google News and Top Stories – in Google’s own crawling, indexing, and ranking systems, as well as the technologies that news publishers need to adopt to be considered for Google News. Publishers don’t have all that much freedom in designing their tech stack if they want to have any hopes of getting traffic from Google.

Ranking in Top Stories

The other aspects of ranking in Google News and Top Stories are about the news site’s editorial choices. While historically the Top Stories algorithm has been quite simplistic and easy to manipulate, that’s less the case nowadays. Since the powerful backlash against holocaust denial stories appearing in Google News, the search engine has started putting more resources into its News division, with a newly launched Google News vertical as the result.

The algorithms that decide which stories show up in any given Top Stories carousel take a number of aspects into consideration:

- Is the article relevant for this query?
- Is it a recently published or updated article?
- Is it original content?
- Is the publisher known to write about this topic?
- Is the publisher trustworthy and reliable?

In Google News there is also a certain amount of personalisation, where Google’s users will see more stories from publishers that they prefer or that are seen as geographically relevant (for example because it’s a newspaper local to the story’s focus).
And of course, a lot of the ranking of any given news article depends on how well the article has been optimised for search. A classic example is Angelina Jolie’s column for the New York Times about her double mastectomy – if you search for ‘angelina jolie mastectomy’ her column doesn’t rank at all, and at the time it didn’t appear in any Top Stories carousel. What you see are loads of other articles written about her mastectomy, but the actual column that kicked off the story is nowhere to be found. One look at the article in question should tell you why: it’s entirely unoptimised for the most relevant searches that people might type into Google.

Some journalism purists might argue that tweaking an article’s headline and content for maximum visibility in Google News is a pollution of their craft. Yet journalists seem to have no qualms about optimising headlines for maximum visibility at news stands. News publishers have always tried to grab people’s attention with headlines and introduction text, and doing this for Google News is simply an extension of that practice.

Yet even with the best optimised content, news publishers are entirely dependent on Google’s interpretations of their writing. It’s Google’s algorithms that decide if and where an article appears in the Top Stories carousel.

Algorithms Are Never Neutral

According to Google, the new version of Google News uses artificial intelligence: “The reimagined Google News uses a new set of AI techniques to take a constant flow of information as it hits the web, analyze it in real time and organize it into storylines.” This seems like an attempt at claiming neutrality by virtue of machines making the decisions, not humans. But this doesn’t stand up to scrutiny. All algorithmic evaluations are the result of human decisions. Algorithms are coded by people, and that means they will carry some measure of those people’s own unconscious biases and perceptions.
No matter how hard Google tries to make algorithms ‘neutral’, it’s impossible to achieve real neutrality in any algorithm. When Google’s algorithm decides that the story from Site A should appear first in the Top Stories carousel, and a similar story from Site B should be way down at the end of the carousel (or not in there at all), that is the result of countless human decisions – some large, some small – about what constitutes relevancy and trustworthiness.

Even with a diverse base of employees from all different backgrounds and walks of life, creating neutral algorithms is immensely challenging. Senior engineers’ decisions will almost always outweigh junior staff’s decisions, and some people’s biases will be represented in those editorial decisions that are made about how an algorithm ranks content.

And here Google can be very rightfully accused: it has an incredibly homogenous employee base. Ironically, while Google reports on its employees’ ethnicity and gender, it doesn’t report on political leanings – which is what sparked the furore about their lack of diversity in the first place. So we have no way of really knowing if Google’s engineers come from varied political backgrounds. This leaves Google wide open to criticisms of bias, and it’ll be very hard to dismiss those concerns.

Is Google News Biased?

To return to the question of bias in Google News, does Donald Trump have a point?
The PJ Media article that sparked the controversy is deeply flawed and entirely unrepresentative, but there are other sources that point towards a left-leaning bias in Google News. Yet simply looking at Google News search results and evaluating their diversity of opinion is a dangerous approach, because it fails to look at the underlying dependencies that go into creating that result in the first place: the technological demands placed on news publishers, the skill of individual journalists to optimise their articles for Google News, and the ability of news organisations to break stories and set the news agenda.

And we can’t leave out the fact that Google openly admits to making editorial decisions in Google News. Yes, actual people choosing stories to show up for trending topics. From its own relevant support documentation on curation:

The choice of language is very interesting: by using phrases like ‘empirical signals’ and ‘algorithmically populated’ Google intends to create the perception that these human curators have no real editorial influence over what is shown in Google News. Yet, even if we accept the notion that Google’s curators are able to make neutral decisions (which we shouldn’t), we know that algorithms are not neutral themselves, and – risking treading on philosophical grounds – there’s no such thing as ‘empirical signals’ when it comes to news.

Despite its efforts with the Google News Initiative, Google has done little to alleviate legitimate fears of bias in its ranking algorithms. In fact, due to its near full control over the entire process, Google leaves itself very susceptible to accusations of bias in Google News. With Google’s astonishing dominance in search with over 86% worldwide market share, this does beg the question: Can we trust Google? Should we?
- Technical SEO Masterclass in London
- Polemic Digital backs Glentoran FC for the 2018/19 season
Polemic Digital’s logo features on the back of Glentoran’s home and away shirts for the 2018/19 season

While Polemic Digital works with clients across the globe such as News UK, Seven West Media, Fox News, and Mail Online, we’re a key part of East Belfast’s thriving business community and have been based in the City East Business Centre since our inception. Last month we agreed a partnership with Glentoran FC, the iconic East Belfast football club, to become one of the club’s sponsors, with our company’s logo featuring on the back of the players’ 2018/19 shirts.

Commenting on this partnership with Glentoran FC, Polemic Digital’s founder Barry Adams said: “The people and businesses of East Belfast have supported and inspired us to take pride in working hard and achieving great results. Glentoran embodies this spirit of teamwork and the will to win against the odds. In our conversations with Simon Wallace we quickly recognised the kindred spirit shared by Glentoran and Polemic Digital, and we’re proud to become part of the club’s long and celebrated history.”

Glentoran’s Simon Wallace pictured with Barry Adams from Polemic Digital at The Oval.

Simon Wallace, commercial manager at Glentoran FC, added: “It’s great to have a successful local business like Polemic Digital partner with Glentoran and support our club. As a small local firm, Polemic manages to punch above its weight locally and internationally, which mirrors the drive and ambition that Glentoran FC has shown throughout the years.”

“We’re looking forward to a great season in the league,” Barry Adams continued. “Glentoran is such an iconic club, we’re fully behind the team and hope for a successful season.”

Read more about Polemic Digital here, and visit the Glentoran website at www.glentoran.com.
- Polemic Digital wins Best SEO Campaign at the 2018 DANI Awards
On Friday 13 April the 2018 DANI Awards were held in Whitla Hall at Queen’s University Belfast. Since 2010 the DANI Awards have celebrated the great work in digital done in Northern Ireland, and this year was the biggest event yet with more award submissions than ever before.

We were up for three awards and, with a shortlist full of great companies and exciting projects, we knew there was going to be tough competition in every category. The one we most looked forward to was of course the Best SEO Campaign award. In an ever-changing industry where many agencies chase after the latest hype, we are unashamedly an SEO-only agency. It’s the one thing we do, and we try to do it as well as it can be done. So we were very happy and honoured to win Best SEO Campaign and take home the prize!

It was an evening to celebrate – not only did we win, we also saw many of our friends in the Northern Irish digital industry pick up awards! Huge congratulations to the folks at The Tomorrow Lab, Digital 24, Loud Mouth Media, and Fathom, and especially to Emma Gribben for winning Young Digital Person of the Year! (I interviewed Emma as part of my NI Digital Experts series, read her story here.)

We won the award for our work with TheSun.co.uk, and sometimes people ask what makes for an award-winning campaign. How do the judges decide what’s worthy of recognition, and are the winners really deserving of it? I don’t usually share specific client results, but since this was such a great project to work on and has been written about before, I feel it’s okay to share a bit about this project. Since the launch of the new TheSun.co.uk site in 2016, search visibility growth has been astonishing and the site has been going from strength to strength. This is what the Sistrix graph of an award-winning campaign looks like:

While I collected the award, it’s really for the combined efforts of everyone involved in SEO at The Sun.
They are some of the smartest and most driven people I’ve ever had the privilege of working with. Lately many news sites have had to deal with significant algorithm updates from Google that had a profound impact on the industry. These are the types of challenges that I thrive on. Hopefully we can continue the site’s stellar growth and demonstrate the power of an all-encompassing approach to SEO.
- Polemic Digital shortlisted for three 2018 DANI Awards
Since their inception in 2010, the DANI Awards have celebrated the best and brightest of the Northern Irish digital scene. In a previous life, when I was with Pierce Communications, we won two DANI Awards in 2012 for our campaigns for Emo Oil and Total Produce. And in 2014, I achieved a great personal honour by winning Digital Industries Person of the Year at that year’s DANI Awards.

Pursuing awards is not something we actively engage in at Polemic Digital. Awards are a dime a dozen, and we have little faith in the majority of them. So far we have chosen to only enter the renowned UK Search Awards, with some measure of success. This year we decided to also enter the DANI Awards, because we felt our client projects were achieving a level of success that warranted recognition. And that gamble seems to have paid off, as all three of the client projects we’ve entered have been shortlisted! Our work is competing in the following categories:

- Best SEO Campaign for our work with TheSun.co.uk
- Best Campaign in Retail for our work with SkirtingsRUs.co.uk
- Best Campaign in Healthcare for our work with DocklandsDental.ie

All three of these websites have achieved considerable SEO success since we started working with them, and to be shortlisted for the 2018 DANI Awards shows that we’re at the forefront of the Northern Irish digital scene. Hopefully on the night itself we’ll come away with some silverware. It’ll be a great event regardless, celebrating the awesome work that’s being done in our wee country.

I’m also very pleased to see people and companies we consider friends of Polemic Digital also shortlisted at this year’s awards. Good luck to the folks at The Tomorrow Lab, Digital 24, Loud Mouth Media, and Fathom, and especially to Emma Gribben who’s shortlisted for Best Young Digital Person!
- View Source: Why it Still Matters and How to Quickly Compare it to a Rendered DOM
- Polemic Digital shortlisted for two 2017 UK Search Awards
Last year, we entered the UK Search Awards for the very first time in our existence. In a marketplace where award ceremonies are a dime a dozen, the UK Search Awards have always stood out as something special. The judging panel on these awards is second to none, and we knew that our work was going to be judged on merit alone – and not the size of our sponsorship budget.

So in 2016, with two and a half years of business under our belt, we more or less wanted to see where we stood in the crowded SEO landscape in the UK. We submitted a few projects to the awards, and were delighted to find ourselves shortlisted in three award categories. We never expected to win that year. After all, we were just a small two-person business in Belfast, and we were competing against some of the UK’s biggest and most established agencies and brands. So when we ended up winning two awards, we were stunned and amazed.

Polemic Digital’s 2016 UK Search Awards

This year we decided to enter again. While the business has evolved somewhat this last year, focusing primarily on SEO audits, SEO training, and specialised SEO for news publishers, we had a few ongoing projects we were proud of and hoped the judges might consider favourably. So when last week the shortlist was announced, we were eager to see if we’d made the cut. And indeed, we did! Polemic Digital is shortlisted in two categories:

- Best Use of Search – Retail
- Best Small SEO Agency

While we won last year’s Best Small SEO Agency award, in which our tiny two-person agency went up against outfits that had up to 25 members of staff, we feel our chances to extend our winning streak are quite small; every year the competition gets tougher, with more companies submitting more projects to the awards. This year, the shortlist boasts a truly outstanding selection of agencies and projects.
Still, even if we leave empty-handed, the awards night on November 30 in London will be another superb event celebrating all that is awesome about the search industry in the UK. Our local friends at Loud Mouth Media are once again on the shortlist, continuing their success as Northern Ireland’s finest PPC agency. And many of our agency friends in the UK, such as Marketing Signals, Branded3, Verve Search, MediaVision, BlueGlass, 10 Yetis, Screaming Frog, and many more are also shortlisted for awards. So it’ll be an amazing night, no matter what. Just to be shortlisted among the UK’s biggest and best is all we ever wanted, so we’re already considering this mission accomplished! Update: we didn’t win any awards but had a great night nonetheless. Congratulations to all the winners, well-deserved!
- Technical SEO Training
- Prevent Google From Indexing Your WordPress Admin Folder With X-Robots-Tag
I recently wrote an article for State of Digital where I lamented the default security features in WordPress. Since it is such a popular content management system, WordPress is targeted by hackers more than any other website platform. WordPress websites are subjected to hacking attempts every single day. According to Wordfence's March 2017 attack report, there were over 32 million attempted brute force attacks against WordPress sites in that month alone.

Out of the box, WordPress has some severe security flaws that leave it vulnerable to brute force attacks. One of these flaws is how WordPress prevents search engines like Google from crawling back-end administration files: through a simple robots.txt disallow rule.

User-agent: *
Disallow: /wp-admin/

While at first glance this may seem perfectly sensible, it is in fact a terrible solution. There are two major issues with the robots.txt disallow rule:

- Because a website's robots.txt file is publicly viewable, a disallow rule points hackers straight to your login folder.
- A disallow rule doesn't actually prevent search engines from showing blocked pages in their search results.

I don't recommend using robots.txt blocking as a method to protect secure login folders. Instead there are other, more elegant ways of ensuring your admin folders are secure and cannot be crawled and indexed by search engines.

X-Robots-Tag HTTP Header

In the context of SEO, the best-known elements of an HTTP exchange are the status code and the User-Agent header. But there are other HTTP headers which can be utilised by clever SEOs and web developers to optimise how search engines interact with a website, such as Cache-Control headers and the X-Robots-Tag header.

The X-Robots-Tag is an HTTP response header that informs search engine crawlers ('robots') how they should treat the page being requested. It's this header that can be used as a very effective way to prevent login folders and other sensitive information from being shown in Google's search results.
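To illustrate the first issue: anyone can fetch a site's robots.txt and extract its "hidden" folders with a couple of lines of code. A minimal sketch (the robots.txt content is hardcoded here for illustration, rather than fetched from a live site):

```python
# Sketch: how trivially a robots.txt Disallow rule reveals "hidden" folders.
robots_txt = """User-agent: *
Disallow: /wp-admin/
"""

# Pull out every disallowed path - exactly what an attacker's script would do.
hidden = [line.split(":", 1)[1].strip()
          for line in robots_txt.splitlines()
          if line.lower().startswith("disallow")]

print(hidden)  # ['/wp-admin/']
```

The disallow rule intended to keep the admin folder out of search results doubles as a signpost for anyone probing the site.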
Search engines like Google support the X-Robots-Tag HTTP header and will comply with the directives given by this header. The directives the X-Robots-Tag header can provide are almost identical to those enabled by the meta robots tag. But, contrary to the meta robots tag, the X-Robots-Tag header doesn't require the inclusion of an HTML meta tag on every affected page of your site. Additionally, you can configure the X-Robots-Tag HTTP header to work for files where you can't include a meta tag, such as PDF files and Word documents.

With a few simple lines of text in your website's Apache htaccess configuration file, we can prevent search engines from including sensitive pages and folders in their search results. For example, with the following lines in the website's htaccess file, we can prevent all PDF and Word document files from being indexed by Google:

<FilesMatch "\.(pdf|doc|docx)$">
Header set X-Robots-Tag "noindex, nofollow"
</FilesMatch>

It's always a good idea to configure your website this way, to prevent potentially sensitive documents from appearing in Google's search results. The question is, can we use the X-Robots-Tag header to protect a WordPress website's admin folder?

X-Robots-Tag and /wp-admin

The X-Robots-Tag doesn't allow us to protect entire folders in one go. Unfortunately, due to Apache htaccess restrictions, the header can only be triggered by rules that apply to file types, not to entire folders on your site. Yet, because all of WordPress's back-end functionality exists within the /wp-admin folder (or whichever folder you may have changed that to), we can create a separate htaccess file for that folder to ensure the X-Robots-Tag HTTP header is served for all webpages in that folder. All we need to do is create a new htaccess file containing the following rule:

Header set X-Robots-Tag "noindex, nofollow"

We then use our preferred FTP programme to upload this .htaccess file to the /wp-admin folder, and voilà.
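To make concrete what that rule produces on the wire, here's a minimal sketch using Python's standard library (not Apache – just an illustration) of a server that attaches the X-Robots-Tag header to every response under /wp-admin, mimicking the effect of the folder-level htaccess file:

```python
# Sketch: mimic the /wp-admin/.htaccess rule by attaching the X-Robots-Tag
# header to every response whose path falls under /wp-admin.
from http.server import BaseHTTPRequestHandler, HTTPServer
import threading
import urllib.request


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        if self.path.startswith("/wp-admin"):
            # Same directive the htaccess file sets for the folder.
            self.send_header("X-Robots-Tag", "noindex, nofollow")
        self.send_header("Content-Type", "text/html")
        self.end_headers()
        self.wfile.write(b"<html><body>page</body></html>")

    def log_message(self, *args):
        pass  # keep demo output quiet


if __name__ == "__main__":
    # Start the server on a random free port and fetch an admin page.
    server = HTTPServer(("127.0.0.1", 0), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    port = server.server_address[1]
    resp = urllib.request.urlopen("http://127.0.0.1:%d/wp-admin/index.php" % port)
    print(resp.headers.get("X-Robots-Tag"))  # noindex, nofollow
    server.shutdown()
```

Pages outside /wp-admin are served without the header, which is exactly the behaviour we want from the real htaccess setup: only the admin folder is marked noindex.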
Every page in the /wp-admin section will now serve the X-Robots-Tag HTTP header with the 'noindex, nofollow' directives. This will ensure the WordPress admin pages will never be indexed by search engines.

You can also upload such an htaccess file configured to serve X-Robots-Tag headers to any folder on your website that you want to protect this way. For example, you might have a folder where you store sensitive documents you want to share with specific third parties, but don't want search engines to see. Or if you run a different CMS, you can use this to protect that system's back-end folders from getting indexed.

To check whether a page on your site serves the X-Robots-Tag HTTP header, you can use a browser plugin like HTTP Header Spy [Firefox] or Ayima Redirect Path [Chrome], which will show you a webpage's full HTTP response. I would strongly recommend you check several different types of pages on your site after you've implemented the X-Robots-Tag HTTP header, because a small error can result in every page on your website serving that header. And that would be a Bad Thing.

To check if Google has indexed webpages on your site in the /wp-admin folder, you can do a search with advanced operators like this:

site:website.com inurl:wp-admin

This will give a search result listing all pages on website.com that have 'wp-admin' anywhere in the URL. If all is well, you should get zero results.

The X-Robots-Tag HTTP header is a simple and more robust approach to securing your WordPress login folders, and can also help optimise how search engines crawl and index your webpages. While it adds to your security, it's by no means the only thing you need to do to secure your site. Always make sure you have plenty of security measures in place – such as basic authentication in addition to your CMS login – and install a plugin like Wordfence or Sucuri to add extra layers of protection.

If you liked this post, please share it on social media.
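As a sketch of the basic-authentication measure mentioned above: the same /wp-admin/.htaccess file can combine the noindex header with a password prompt using Apache's standard auth directives. The AuthUserFile path below is a placeholder – use a real path outside your web root, and note that the Header directive requires mod_headers:

```apache
# /wp-admin/.htaccess - keep admin pages out of search results
# and require a password before WordPress's own login even loads.
Header set X-Robots-Tag "noindex, nofollow"

AuthType Basic
AuthName "Restricted area"
# Placeholder path; point this at your actual htpasswd file.
AuthUserFile /home/example/.htpasswd
Require valid-user
```

This way a brute force attack has to get through two separate password layers, and search engines never see the admin pages at all.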
You might also like to read this post about protecting your staging environments.