Bookmarked Remove HTML In Google Sheets Cells (Stack Overflow)

Trying to determine the best method to automatically remove HTML in all cells within a column in Google Sheets.

Example of cell data:

<span style="color:#0000FF">test</span>
I’d li…

After completing Ben Collins’ REGEX course, I wondered if it were possible to replace HTML with Markdown. I found the following REGEX formula for stripping out HTML:

=REGEXREPLACE(A1,"<[^<>]+>","")
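For comparison outside Sheets, here is a minimal sketch of the same pattern in Python. The function name `strip_tags` is hypothetical, and the sample cell is the one from the Stack Overflow question, written with straight quotes. It is a heuristic, not an HTML parser.

```python
import re

# Same pattern as the Sheets formula: "<", then one or more characters
# that are neither "<" nor ">", then ">".
TAG_PATTERN = re.compile(r"<[^<>]+>")


def strip_tags(cell: str) -> str:
    """Remove anything that looks like an HTML tag from a cell value."""
    return TAG_PATTERN.sub("", cell)


print(strip_tags('<span style="color:#0000FF">test</span>'))  # test
```

This works for simple, well-behaved markup like the example cell, which is probably all that a column of exported data contains.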

I also found a ‘dirty’ converter code to run as a script.

However, I also found this post explaining why REGEX is not designed for parsing HTML:

Entire HTML parsing is not possible with regular expressions, since it depends on matching the opening and the closing tag which is not possible with regexps.

Regular expressions can only match regular languages but HTML is a context-free language and not a regular language (As @StefanPochmann pointed out, regular languages are also context-free, so context-free doesn’t necessarily mean not regular). The only thing you can do with regexps on HTML is heuristics but that will not work on every condition. It should be possible to present an HTML file that will be matched wrongly by any regular expression.
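To see the kind of input the quoted answer has in mind, here is a small sketch comparing the tag-stripping pattern with Python's standard-library `html.parser`. The class and function names (`TextExtractor`, `strip_tags_with_regex`, `strip_tags_with_parser`) and the sample input are my own assumptions, chosen only to show one case where the heuristic breaks: a `>` inside a quoted attribute value.

```python
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect only the text nodes of a document, ignoring tags."""

    def __init__(self) -> None:
        super().__init__()
        self.parts: list[str] = []

    def handle_data(self, data: str) -> None:
        self.parts.append(data)


def strip_tags_with_regex(cell: str) -> str:
    # Heuristic: same pattern as the Sheets formula.
    return re.sub(r"<[^<>]+>", "", cell)


def strip_tags_with_parser(cell: str) -> str:
    # Tokenise the markup with a real parser and keep the text only.
    parser = TextExtractor()
    parser.feed(cell)
    parser.close()
    return "".join(parser.parts)


sample = '<a title="1 > 0">link</a>'
print(repr(strip_tags_with_regex(sample)))   # ' 0">link'
print(repr(strip_tags_with_parser(sample)))  # 'link'
```

The regex stops at the first `>` it meets, so part of the attribute leaks into the output, while the parser knows it is still inside a quoted value. For a spreadsheet column of simple spans the heuristic may still be good enough.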

Liked The unreasonable effectiveness of simple HTML by @edent (shkspr.mobi)

I’ve told this story at conferences – but due to the general situation I thought I’d retell it here.
A few years ago I was doing policy research in a housing benefits office in London. They are singularly unlovely places. The walls are brightened up with posters offering helpful services for p…

Liked Stop solving problems you don’t yet have by Rachel Andrew (rachelandrew.co.uk)

It’s a confusing world of options out there to the beginner and learning the basics of HTML and CSS development for modern browsers, then solving the issues that come up, is still the best grounding for any new developer. This situation is exactly what we asked for in the early days of the Web Stan…

Liked reveal.js – The HTML Presentation Framework by Hakim El Hattab (github.com)

reveal.js comes with a broad range of features including nested slides, Markdown contents, PDF export, speaker notes and a JavaScript API. There’s also a fully featured visual editor and platform for sharing reveal.js presentations at slides.com.

Bookmarked Dear Developer, The Web Isn’t About You by Charlie Owen (sonniesedge.co.uk)

We need to keep that beauty and weirdness going that first came with the early web.

Because the web’s beauty comes from its diversity. A diversity of tech, and a diversity of people.

We’re the enablers and the defenders of that diversity.

So let’s not make it about us. Let’s make it about the wonderfulness of the Weird Wild Web.

In this presentation, Charlie Owen provides a history of the stupid web and argues that we need to return to a beauty and weirdness found in the early web. This beauty and weirdness involves recognising that not everyone is alike, rich, well-connected or able-bodied. At the heart of this is returning HTML to the base of the design pyramid, as opposed to JavaScript. This reminds me in part of a post from Kicks Condor discussing the need for more design.

Much of this is beyond me at this point in time. However, I wonder if WordPress is a part of this problem, rather than a solution? I was really interested in the discussion of Cutting the Mustard (CTM) and wonder what this might look like on my own site(s). At the very least I was left thinking that I probably don’t empathise enough.

Marginalia

The Web is incredible. It’s incredible because it’s stupid. It’s a collection of very stupid, or more accurately, very simple, technologies, all chained together to make something much greater.

The Pyramid of Robustness (© C Owen Enterprises Ltd) was a thing that we cared about. We put the things that were the most solid and reliable at the bottom of the pyramid – in this case server-generated HTML. We then added on a presentation layer (CSS), and then an interaction layer (JavaScript).

We have got to the point where sites require ~2.5 megabytes to download, and the average content-based webpage is now bigger than a copy of Doom (a fully-fledged 3D shooter game) … Most of this size is due to sites not offering srcset variants on their images, and not taking the time to optimise images on those that they do offer. Some of it is due to third-party tracking, advertising, and marketing scripts (marketeers may well be the most script-heavy people in any organisation). A lot of it (but not most, by any means) is due to JS application bundles and third party scripts used to run a page (such as jQuery – still a major force on most of the web).

Yes, it’s technically amazing to build your 747, or have your JS build a content page, but it’s utterly over-engineered and impractical for most occasions. I’m laying it out here – I’m marking my line in the sand: JavaScript only when there’s no other choice. It shouldn’t be the first port of call for building a site.

If we want to make the web better for people then the most important thing that we can do is to learn the basics. Not of technology, but of our fellow humans. Because, as we’ve shown earlier, empathy is the most important skill that a developer can have. Our job is 100% about people, about our fellow humans. How can we do an amazing job for them if we don’t understand who we are building for?

So how do you combine 100% universality with the fact that some people have ancient, terrible browsers that it would be a time-sink to support? CTM gives the answer! Only those browsers that are “good enough” receive the advanced features. Those that have poor technology support silently fail the test and receive the core version. No having to support ancient browsers!

via Greg McVerry