The goal was to store text (and images) in a gDoc. Call the gDoc, grab the stuff and put it into an email. The format for the gDoc was going to be XML, a very friendly way to store data in text:
---- gDoc File content --
<subject> BRICs Course Offerings 2023</subject>
<attachments> {a file reference somehow} </attachments>
<body>
Hi - This email provides our Course offerings for 2023.
Please support our Work by purchasing just 5 hours of your CE credits from us each renewal period.
Regards
Bryan and Ric (with signature images, preferably)
</body>
---------------------------
This should be a no-brainer for regEx, right? Not at all... not f--ing at all...
I should have been able to do something like this in JavaScript / Google Apps Script:
pattern = / <subject>(.+)</subject> /
arrMatches = gDocText.match(pattern)
And I should have gotten just
"BRICs Course Offerings 2023" in the arrMatches[0] position
The parentheses should have acted as a "capture" and the .+ should have acted as a wildcard for all characters that exist until you find the closing tag. Simple stuff is not so simple...
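For comparison, here is a minimal sketch of that subject extraction in plain JavaScript. The gDocText string is a stand-in for the document text (an assumption for illustration). Two details differ from the attempt above: the slash in the closing tag is escaped inside the /.../ literal, and the captured text sits at index 1, because index 0 holds the entire match with the tags included.

```javascript
// Stand-in for the text pulled from the gDoc (hypothetical sample).
const gDocText = "<subject> BRICs Course Offerings 2023</subject>";

// The "/" in the closing tag is escaped so it does not end the regex literal.
const pattern = /<subject>(.+)<\/subject>/;
const arrMatches = gDocText.match(pattern);

// Index 0 is the whole match (tags included); index 1 is the first capture group.
const wholeMatch = arrMatches[0];
const subject = arrMatches[1].trim();

console.log(wholeMatch);
console.log(subject);
```

The trim() call removes the leading space that sits between the opening tag and the text.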
Problem 1 - the forward slash as bookends - The forward slash is the JavaScript regex delimiter on both ends, so obviously it can't show up in the pattern without an "escape" somehow, BUT in regEx itself the / slash is not a character that requires an escape? If you do escape it, the Google Apps Script processor will accept the escape and compile, but the match then never seemed to work... Why not change the start and end pattern markers to something crazy and/or customizable? Even better, then nothing is ever a problem...
objRegex.pattern_bookends = "//$//"
objRegex.pattern = //$// <body>(.+)</body> //$//
{What a novel concept... then the back-end code just uses a split command and takes position 1 as the pattern, and the bookends are never problematic again.}
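JavaScript already offers something close to this: the RegExp constructor takes the pattern as an ordinary string, so there are no slash bookends and the / needs no escape. The sketch below also mimics the custom-bookend idea. The patternFromBookends helper and the //$// marker are hypothetical, taken from the proposal above, and are not an existing API.

```javascript
// Pattern supplied as a plain string: "/" is just another character here.
const viaConstructor = new RegExp("<subject>(.+)</subject>");

// Hypothetical helper following the bookend proposal: split on the
// bookend marker and treat position 1 as the pattern source.
function patternFromBookends(wrapped, bookend) {
  const parts = wrapped.split(bookend);
  return new RegExp(parts[1].trim());
}

const viaBookends = patternFromBookends("//$// <subject>(.+)</subject> //$//", "//$//");

const sample = "<subject> BRICs Course Offerings 2023</subject>";
const fromConstructor = sample.match(viaConstructor)[1].trim();
const fromBookends = sample.match(viaBookends)[1].trim();

console.log(fromConstructor);
console.log(fromBookends);
```

The trade-off with the string form is that backslashes have to be doubled inside a normal string literal, so \s is written as "\\s".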
Problem 2 - the capture didn't work - The ( ) parentheses have several uses. One was/is supposed to be a subset capture somehow. In many applications the goal is to get info that is surrounded by known info that you do NOT need. You can always add that back if you need it?! Why is there no perfect and always functional capture system for a subset of the match?!
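One possible reason a capture appears to fail in JavaScript (an assumption, since the exact failing code is not shown here): with the g flag, match() returns only the full matches and drops the capture groups. Without g, or with matchAll(), the captured subsets are available. The sample text below is hypothetical.

```javascript
const text = "<subject>One</subject> <subject>Two</subject>";

// Without the g flag: index 0 is the full match, index 1 is the capture.
const single = text.match(/<subject>(.+?)<\/subject>/);

// With the g flag: only full matches come back, captures are dropped.
const globalMatches = text.match(/<subject>(.+?)<\/subject>/g);

// matchAll (which requires the g flag) keeps the capture for every match.
const captures = Array.from(text.matchAll(/<subject>(.+?)<\/subject>/g), m => m[1]);

console.log(single[1]);
console.log(globalMatches);
console.log(captures);
```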
Problem 3 - Line feeds and carriage returns - I struggled with LF and carriage returns for a bit when I had the tags on their own lines. I finally did something like [\S\s] to get it all. Why not put "commas" in to separate the stuff, with a simple comma escape system like the double quotes system... just for readability...
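The line-break trouble comes from the dot not matching line terminators by default. A character class such as [\s\S] matches every character, line feeds and carriage returns included, and in engines that support it the s (dotAll) flag makes the dot do the same. A minimal sketch with a hypothetical multi-line sample:

```javascript
const docText = "<body>\nHi - This email provides our Course offerings for 2023.\nRegards\n</body>";

// "." stops at line breaks, so this finds nothing when the tags sit on their own lines.
const dotOnly = docText.match(/<body>(.+)<\/body>/);

// [\s\S] matches any character, line breaks included; *? keeps the match non-greedy.
const anyChar = docText.match(/<body>([\s\S]*?)<\/body>/);

// Same idea with the s (dotAll) flag, where the engine supports it.
const dotAll = docText.match(/<body>(.*?)<\/body>/s);

console.log(dotOnly); // null
console.log(anyChar[1].trim());
console.log(dotAll[1].trim());
```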
=====================================================================
I spent the better part of an hour or two trying to get this to work before having to go in an entirely different direction. This isn't a useful system.
I ended up having to use <subject> <end_subject>, grab the entire string and then remove the tags. Truthfully, having to remove the tags is not a big deal, but then I had to contend with white space (line feeds) too. Unsure if any trim would have worked easily. The capture of a lesser set than the full search criteria seems like it should be a no-brainer and is not...
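For what it's worth, here is a sketch of how the original tag layout might be handled with a capture plus trim(). The extractTag helper is a hypothetical name, the sample text mirrors the file layout shown at the top, and in Apps Script the text would presumably come from something like DocumentApp.openById(id).getBody().getText().

```javascript
// Hypothetical helper: returns the trimmed content of <tag>...</tag>, or null.
// Assumes the tag name contains no regex special characters.
function extractTag(text, tag) {
  // Built from a string, so the "/" in the closing tag needs no escape.
  const re = new RegExp("<" + tag + ">([\\s\\S]*?)</" + tag + ">");
  const m = text.match(re);
  return m ? m[1].trim() : null;
}

// Sample text mirroring the gDoc layout (hypothetical content).
const fileText = [
  "<subject> BRICs Course Offerings 2023</subject>",
  "<body>",
  "Hi - This email provides our Course offerings for 2023.",
  "Regards",
  "</body>",
].join("\n");

const emailSubject = extractTag(fileText, "subject");
const emailBody = extractTag(fileText, "body");
const missing = extractTag(fileText, "attachments"); // tag not present

console.log(emailSubject);
console.log(emailBody);
console.log(missing);
```

Here trim() strips the leading space after the subject tag and the line feeds around the body, so no separate whitespace cleanup pass is needed.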