Obfuscation #1873
Comments
Reading system requirements are specified in the Reading Systems specification. In this case, they are required to reverse the process to deobfuscate: https://w3c.github.io/epub-specs/epub33/rs/#sec-container-res-obfus |
Yes, I did eventually figure that out! Thanks. I'm not clear on whether the user agent/reading system is supposed to not provide the de-obfuscated file to the user, or if that's just a requirement that comes externally from the vendor or the DRM provider and not part of the spec. |
This looks like a bad port of the original algorithm specified in http://idpf.org/epub/20/spec/FontManglingSpec_2.0.1_draft.htm. That document is contradictory on this point (sigh), but it says in the Obfuscation Algorithm section:
The "should" comes later, but it doesn't make sense how it couldn't also be a must. It appears that when it was integrated in the original 3.0 revision, the "must" was dropped (it's no longer limited to 20 bytes) and the later "should" retained. But I agree that makes no sense, since if you can't know how to create the key, you can't know how to deobfuscate. Assuming we keep the section, it definitely needs correcting. |
The reading system will deobfuscate and use the font, but I'd assume most apps, at least, do that in memory and don't write it out to disk, if that's what you're asking. (I don't write reading systems, so that's just my understanding based on how other resources have been stated to be handled; maybe someone else will correct me.) The user typically isn't going to have any way from within the reading system to access the deobfuscated source regardless of how it's done, though, just as they can't access any other resources. If the reading system is running in a browser, my guess would be you might be able to access the font (but maybe only as a blob url?). I don't think preventing access completely is realistic in this situation. But obfuscation was always meant to be trivial, so the requirements have been pretty thin. It's tacitly understood that if the user wants at the font, and they have access to the epub file, they'll be able to get it. |
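To make concrete how thin the protection is, here is a minimal Python sketch of the algorithm under discussion. It assumes the algorithm as I understand it from the EPUB 3 specifications: the key is the SHA-1 digest of the publication's unique identifier with whitespace removed, and that key is XORed over the first 1040 bytes of the resource. The names `make_key` and `xor_prefix` are illustrative only and are not taken from the spec or from any reading system.

```python
import hashlib

# Assumption: only the first 1040 bytes of the resource are obfuscated.
OBFUSCATED_PREFIX_LEN = 1040

# Assumption: these are the whitespace characters stripped from the identifier.
_WHITESPACE = "\u0020\u0009\u000d\u000a"


def make_key(unique_identifier: str) -> bytes:
    """Derive the 20-byte key from the publication's unique identifier."""
    stripped = "".join(ch for ch in unique_identifier if ch not in _WHITESPACE)
    return hashlib.sha1(stripped.encode("utf-8")).digest()


def xor_prefix(data: bytes, key: bytes) -> bytes:
    """XOR the leading bytes with the key, cycling through the key.

    XOR is its own inverse, so the same call obfuscates and deobfuscates.
    """
    head = bytes(
        b ^ key[i % len(key)]
        for i, b in enumerate(data[:OBFUSCATED_PREFIX_LEN])
    )
    return head + data[OBFUSCATED_PREFIX_LEN:]
```

Because the operation is symmetric, a reading system can apply `xor_prefix` to the bytes in memory and hand the result to its font loader without ever writing the restored file to disk, which matches the behaviour assumed above.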
The issue was discussed in a meeting on 2021-10-26
1.2. DRM and obfuscation. See GitHub issues #1873, #1874. Wendy Reid: next, DRM and obfuscation. Dave Cramer: not aware of real world use of obfuscation aside from fonts. Nick Doty: the obfuscation can be undone easily though. Dave Cramer: some font vendors have told me that even these very ineffective means are good because then they can say that if you work around them then you violated the DMCA, etc.
Wendy Reid: obfuscation tends to break things in RS.
Wendy Reid: DRM is tricky because the spec does not specify the DRM to be applied. Brady Duga: i think you can only obfuscate fonts.
Rick Johnson: my opinion is that we shouldn't address this in spec. Matt Garrish: we've never wanted to go into DRM implementation in spec. Samuel Weiler: but you have the hooks for it in the spec, even if you aren't fully specifying the DRM.
Tzviya Siegman: epub exists in an ecosystem that has been around for a long time. If we took out those hooks we would be ceding our standard to a world that would not accept the lack of it. Wendy Reid: we wouldn't have the support of most of the publishing industry, RS would be happy, and retailers would also be impacted.
Dave Cramer: I might disagree. If we took this out of the spec, would this change anything in practice? Matt Garrish: i agree with dauwhe. It doesn't break anything if we take this out. Wendy Reid: okay, good points everyone. This is something we need to assess as part of our privacy threat model. |
The issue was discussed in a meeting on 2021-10-28
4.3. Obfuscation (issue epub-specs#1873). See GitHub issue epub-specs#1873. Dave Cramer: another thing PING brought up: obfuscation. Matt Garrish: it would be nice if we didn't have this, but now that we're 10 years in, not sure what we can do about it. Brady Duga: we can't remove it; we could recommend using WOFF.
Theresa O'Connor: clarification on obfuscation? Dave Cramer: how to keep fonts from escaping the EPUB to protect font vendors.
Dan Lazin: keeping in mind that Adobe is a main exporter of custom ebooks, they have had a strong interest in both DRM and protecting their fonts.
|
If the obfuscation is trivial, then I'm not sure what value it's providing. If it's there for historical or backcompat only, it could be deprecated and warnings put in place to minimize any future harm. Perhaps the purpose (as was mentioned at the joint group call) is to make it easier to sue or threaten criminal consequences for anyone who writes a simple script to de-obfuscate or any vendor that implements a Reading System that happens to save the file to disk. If so, that seems inconsistent with ethical Web principles. Also, if the purpose is to enhance legal threats, we should probably document that risk somewhere: I don't want someone getting sued because they implemented -- or wrote tests for! -- the Reading System specification. |
Trivial is still going to block most non-technical self publishers from being able to take the font and drop it in their own book, for example. (I'd hope professional publishers would know better.) It provides a measure of defence for the font vendor. I don't think it matters much legally whether you took an unobfuscated font directly or you figured out how to reverse the obfuscation, but IANAL. You're violating the font licensing agreement by reusing it without paying for it. Whether the user agent assumes any risk by allowing access to the unobfuscated version isn't something I can answer, either. The theft isn't in deobfuscating but in reusing without a license, so I would expect not. There's been some discussion about the origins of this and whether it's still needed in the group's email list starting here: https://lists.w3.org/Archives/Public/public-epub-wg/2021Oct/0025.html |
As I noted in the meeting minutes above, I think it's still "needed" because most (commercial) EPUBs are exported from InDesign and Adobe cares a lot about font copyright. I would guess that Adobe would not agree to removing obfuscation, and practically speaking we would need them to. |
This isn't theft, but potential copyright infringement, to be clear. Obfuscation doesn't only make it a little more difficult for someone to copy a font into another publication that they would sell without permission (a clear case of copyright infringement), but also often breaks epub files when they're edited on a user's device. I am also not a lawyer, but anti-circumvention provisions in the DMCA and other laws around the world do make it particularly risky to produce (or distribute or market) de-obfuscation tools, even if you never use them or intend them for copyright infringement. More background here: https://en.wikipedia.org/wiki/Anti-circumvention
This is super useful context, thank you! It also recommends a clear way forward, that WOFF or some subsetting proposals could make obfuscation (and the legal risks of de-obfuscation) unnecessary. Also, if obfuscation is only ever used for font files, that would be a useful limitation to note. Many of the effects (for privacy, accessibility, etc.) would be less severe if the only obfuscated files are ones that don't include contents of the text, active scripts or references to external resources. |
We will propose to restrict obfuscation to only fonts, and we can enforce this via EPUBCheck. We can't remove obfuscation entirely as the feature is widely used. Adobe InDesign does this. Forbidding this would break thousands and thousands of existing books. |
The situation is quite different in Japan.
I'm not against Dave's comments above. |
I would certainly recommend requiring limiting this obfuscation technique to only where it's already being used. Can it also be marked as a deprecated technique, with clear alternatives (to WOFF or something else) to move to something better? If it's known that this feature is generally bad for users and authors and reading systems but is included for backwards compatibility, then we should be able to note it as deprecated and provide better methods going forward. |
We could add a caution note that obfuscation should be avoided, but our hands are kind of tied when it comes to formally deprecating practices that have adoption. Deprecating leads to warnings in validation, which leads to content being rejected by vendors, which leads to angry publishers. It's formally in our charter that we not deprecate features that are relied on by publishers. |
Agreed with Matt, but can someone clarify for me (FYI, Nick, I am pretty new here) whether we think that anyone is using obfuscation for anything other than fonts? Do RSes support obfuscation for anything other than fonts? I have not seen it anywhere other than fonts, but I am pretty new. I would be in favor of saying "obfuscation is used for fonts, but you could and maybe should (?) use WOFF etc instead, and although obfuscation could theoretically be used for other resources, in practice no reading systems support it for anything other than fonts" ... or whatever is actually true. In short, no need to formally deprecate, but we should document the practical state of the world and encourage WOFF for people who can use it. |
I am not aware of usage outside of fonts. I think we should forbid usage outside of fonts. |
+1 to Dave - this used to be called "Font Obfuscation", which pretty
clearly tied it to fonts. I think (though am not certain) that loosening
this to other content types was an oversight, not an intentional feature.
|
I didn't realize this! I suppose it depends whether privacy or compatibility issues qualify as "serious issues (such as a security bug)". It would be useful for future requests for reviews if you could let the reviewers know whether the charter prohibits making any changes to address issues the reviewers might raise. |
I think it was something in-between. I remember us discussing the change, but I can't find much about why. It appears it was done in 3.0.1 at the same time we defined the compression order, as I did find this in some old minutes:
I'm pretty sure it wasn't done to enable obfuscation for a specific other case, though. I believe it was only because there was nothing in the section that required it to be used for fonts, so we were only making the section reflect that it could be used for other things. |
Ya, sorry, we've just come to accept this limitation. We tried some radical changes to EPUB in the 3.1 revision, and then had to undo a lot of the work in 3.2 when publishers balked at implementing the specification. That's how it ended up in our charter. We'd probably have to reduce the use of obfuscation to near zero before we could deprecate, otherwise a similar cycle will play out where the specification is ignored, or certainly that part. A caution note could say that we intend to deprecate the feature in the future, which would at least give the community fair warning to look at the alternatives. The other option would be to look at making a note out of obfuscation, encryption.xml, and rights.xml. Obfuscation began life as a note in IDPF, after all. It wouldn't change anything as far as publishers being able to implement obfuscation and DRM, but perhaps it helps avoid enshrining details in a standard. |
The issue was discussed in a meeting on 2021-11-11 List of resolutions:
1. Obfuscation (issue epub-specs#1873). See GitHub issue epub-specs#1873. Dave Cramer: from PING horizontal review, there were questions about obfuscation. Brady Duga: WOFF also can't be copied for use on your system. Shinya Takami (高見真也): in JP in many cases we don't have encryption in RS. So this won't have a big impact on JP market. It won't be a problem. Matt Garrish: re. limiting to fonts, how would we do this? List a set of font formats? Dave Cramer: i was thinking we would limit to font core media type. Matt Garrish: that covers the widely used ones, but not sure if there's anything else out there. Brady Duga: I wish we could just restrict the list of fonts to the core list. Dave Cramer: right, what about postscript type1 fonts. Brady Duga: can we just non-normatively note what the intention is?
Matt Garrish: i wonder if there's some way of tying this to how fonts are declared. Dave Cramer: okay, that might be cleaner way of getting to the result we want. Brady Duga: it feels like that would be hard to do, e.g. chemML which has its own way of referencing fonts. Dave Cramer: the proposed solution would satisfy the vast majority of cases though, most people would be happy with it. Brady Duga: the problem is that the media type for fonts isn't well defined, there could be epubs out there using weird fonts. Matt Garrish: i think epubcheck already has some sort of internal list of font types, but we'd need them to confirm. Brady Duga: changes like this push vendors towards moving away from using epubcheck as part of their ingestion pipeline. |
To follow up from the meeting last night, I dug into epubcheck and there is a list of pattern matches for fonts that covers the CMTs for remote fonts:
There's a similar check for EPUB 2, so it's probably safe to assume that using the CMT list as a basis for restricting obfuscation will cover the vast majority of what's out there. If other font formats are in use, then I'd imagine those folks aren't bothering with epubcheck and whatever restrictions we place here aren't going to matter to them anyway. |
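A restriction based on the CMT list could be checked along the lines below. This is only a rough Python sketch: the media type set is my reading of the font core media types in EPUB 3.3, not the pattern list from epubcheck, and `non_font_obfuscated` along with its inputs (a manifest mapping of resource path to media type, and the list of paths flagged as obfuscated in encryption.xml) are hypothetical names for illustration.

```python
from __future__ import annotations

# Assumption: font core media types as listed in EPUB 3.3.
# This is NOT epubcheck's internal pattern list.
FONT_MEDIA_TYPES = frozenset({
    "font/ttf",
    "font/otf",
    "font/woff",
    "font/woff2",
    "application/font-sfnt",
    "application/font-woff",
    "application/vnd.ms-opentype",
})


def non_font_obfuscated(
    manifest: dict[str, str], obfuscated: list[str]
) -> list[str]:
    """Return obfuscated resource paths whose media type is not a font CMT.

    Paths missing from the manifest are reported too, since their
    type cannot be confirmed.
    """
    return [
        path
        for path in obfuscated
        if manifest.get(path, "").strip().lower() not in FONT_MEDIA_TYPES
    ]
```

A validator would report each returned path as a violation. Fonts in other formats (for example PostScript Type 1) would be flagged too, which is the trade-off raised in the meeting.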
I believe the original issues here have been covered as fully as we can:
|
@npdoty is it o.k. to close this issue now? |
I think it would help to explain the harms of the font obfuscation technique, in addition to the pointers to better alternatives. (Obfuscation breaks compatibility and interoperability of EPUB files, creates opacity for end users inspecting the files they're reading and introduces complexity and potential legal liability for reading system developers.) We might also include a warning (in the RS spec) to reading system developers of the potential legal threats if they provide de-obfuscation or access to fonts. |
I wonder if we should better explain the limitations of font obfuscation on the authoring side so that it fully removes any expectation that reading systems have any obligation to keep the obfuscated font secure. The key sentence in the introduction is this:
The only expectation is that it will help prevent trivial copying out of one container and into another, but this may be something we take for granted. Perhaps we can list ways that obfuscation does not protect the content from copying to better remove any misunderstanding (e.g., that users may be able to gain access to the unobfuscated font through their reading system). There shouldn't be a threat to reading system developers from using obfuscated fonts. The primary point of concern is between the author and the font vendor -- namely, that the vendor agrees that obfuscation is sufficient protection if that vendor isn't the one protecting the resource. |
My understanding is that the DMCA has been used as a legal threat against those distributing open source software that allows for de-obfuscation and saving of font files, and that that threat could also be levied against any reading system that saves the de-obfuscated font file. |
Going back to @toshiakikoike's comment, should deobfuscation support be a recommendation and not a requirement? If there are already reading systems ignoring obfuscated fonts, it would be contradictory to compel reading systems to support deobfuscation. You must deobfuscate the font even if you don't use it? |
Obfuscation in EPUB has been around since 2008 or so. I'm not aware of litigation around this, or threats of litigation. By "saving the de-obfuscated file" do you mean making the font easily available to the end user in its original form? |
I agree that the obfuscation algorithm is not challenging for a programmer to bypass and that the code is publicly available (as is the algorithm). I believe the DMCA doesn't require protections to be especially strong for it to be illegal to provide circumvention tools.
I should be more precise here. I don't know for certain that a particular DMCA complaint has been filed, I've just observed someone posting a link to a github repo for a tool that does de-obfuscation and then the link being broken / code not being available. (My recollection was that there was a reference to DMCA or to a legal issue, but if there was, I don't have a link handy any more.)
Yes, that's what I mean. |
There's a very recent case where a GitHub repo that had code to completely remove DRM from an ebook was taken down via a DMCA notice. |
The issue was discussed in a meeting on 2022-02-03 List of resolutions:
2. Updates to Obfuscation. See GitHub pull request epub-specs#1980. See GitHub issue epub-specs#1873. Dave Cramer: this is the PR. There's a lot of discussion in the related issue. Brady Duga: i'm skeptical of even mentioning legal issues without explicit guidance from lawyers. Dave Cramer: i share this concern. Matt Garrish: yeah, i struggled to come up with a caution that was meaningful. Dave Cramer: and some of the other limitations are legitimate.
Brady Duga: i'm fine with the general caution. Dave Cramer: mgarrish can you just remove the legal reference? Matt Garrish: yes. Brady Duga: one other language issue about "designed to break the obfuscation". Wendy Reid: "deobfuscate". Brady Duga: "intentionally make available"? Matt Garrish: agree. Brady Duga: fine with having it as SHOULD support deobfuscation. Dave Cramer: fine with leaving it at SHOULD, this is not a core feature.
|
The issue was discussed in a meeting on 2022-04-08 List of resolutions:
1. Close Privacy & Security Issues. Dave Cramer: the TAG has reappeared with a couple of comments; I am making a PR to mention that when using the web APIs that have the most dramatic privacy and security implications (geolocation, push notifications), you should get user consent. See GitHub issue epub-specs#1959. Dave Cramer: we have several issues where there was never much discussion in the issue (#1959 for example). Ivan Herman: we had a lot of discussion with PING, good discussions, after which we made extensive additions to answer the issues they raised. Gregorio Pellegrino: so has this passed? is it okay? See GitHub issue epub-specs#1872. Ivan Herman: yes, it is okay. Dave Cramer: risk of exposure and fingerprintability. See GitHub issue epub-specs#1873. Dave Cramer: obfuscation, which we've discussed extensively, followed by updates to the spec docs. See GitHub issue epub-specs#1875. See GitHub issue epub-specs#1876. Dave Cramer: interactivity, which we've addressed as best we can given that it's ambiguous. See GitHub issue epub-specs#1957. Dave Cramer: we enumerated the threat model, which deals with #1957. See GitHub issue epub-specs#1958. Dave Cramer: permission prompts, we're dealing with this, strengthened text. See GitHub issue epub-specs#1959.
Dave Cramer: broad user expectations issues, which is covered by the other changes we've made.
Dave Cramer: I think the spec is now much more informative/clear about some of these issues, so thanks everyone.
|
From the PING review: