00:00:00:00 - 00:00:03:29 Thanks and welcome to the Community Contribution Series presentation today. 00:00:04:06 - 00:00:05:16 We're going to start by discussing 00:00:05:16 - 00:00:09:24 my Python-based library for encoding and decoding OSCAL models. 00:00:09:24 - 00:00:12:18 Before we dive in, let's start with some background. 00:00:12:18 - 00:00:15:18 OSCAL-Pydantic is available from GitHub. 00:00:15:19 - 00:00:20:19 It is a Python API for creating and manipulating OSCAL data models. 00:00:20:19 - 00:00:24:28 It provides a high-fidelity schema for all of the data elements. 00:00:25:00 - 00:00:30:02 The objective is to develop the reference implementation of OSCAL in Python. 00:00:30:13 - 00:00:33:07 We don't have reference implementations, but 00:00:33:07 - 00:00:36:11 my goal was to make it so good that nobody needs to write another one. 00:00:36:11 - 00:00:37:27 V1 is available today. 00:00:37:27 - 00:00:41:27 If you need a Python library to do OSCAL validation, 00:00:41:27 - 00:00:45:05 you can pip install oscal-pydantic and you will have it. 00:00:45:05 - 00:00:48:09 However, that's not the whole story, so let's get into the rest of the story. 00:00:48:14 - 00:00:50:07 Building blocks, of course. 00:00:50:07 - 00:00:52:21 OSCAL-Pydantic uses Python. 00:00:52:21 - 00:00:54:01 Very popular language. 00:00:54:01 - 00:00:58:05 It is built on top of the Pydantic data validation library. 00:00:58:10 - 00:01:02:09 We'll talk a little bit more in detail as we get into the appropriate material. 00:01:02:20 - 00:01:05:26 It provides validation of the OSCAL schema, 00:01:05:29 - 00:01:10:22 which is built on top of the NIST Metaschema modeling framework. 00:01:10:22 - 00:01:12:20 That becomes important in a couple of places, 00:01:12:20 - 00:01:13:15 but if you're not familiar 00:01:13:15 - 00:01:16:15 with Metaschema, well, you'll learn a little bit today. 00:01:16:21 - 00:01:20:06 Let us start by discussing OSCAL-Pydantic v1.
00:01:20:18 - 00:01:24:04 First I'd like to start by thanking the Compliance Trestle project 00:01:24:04 - 00:01:25:20 from IBM for inspiration. 00:01:25:20 - 00:01:29:15 I use the word inspiration because stealing is such an ugly word. 00:01:30:04 - 00:01:33:23 But when I was first developing OSCAL tooling for Python, 00:01:33:23 - 00:01:39:03 I imported the entire Compliance Trestle package just to get their library 00:01:39:12 - 00:01:42:12 to be able to build OSCAL models in Python. 00:01:42:16 - 00:01:46:01 Compliance Trestle is built on top of other open source software. 00:01:46:06 - 00:01:51:19 Particularly important, for our purposes, is a tool called datamodel-code-generator. 00:01:51:19 - 00:01:56:13 The tool basically takes JSON Schema and converts it into models 00:01:56:22 - 00:01:59:28 that use the Pydantic library in the Python language. 00:01:59:28 - 00:02:03:27 However, datamodel-code-generator is a little naive and simplistic 00:02:03:27 - 00:02:07:15 about how it does the conversion, so some hand tweaking was required. 00:02:07:18 - 00:02:10:28 First of all, JSON Schema supports two independent string 00:02:10:28 - 00:02:13:28 format specifications for individual elements. 00:02:13:28 - 00:02:15:25 It has something called a format, 00:02:15:25 - 00:02:19:28 which the documentation describes as basic semantic identification 00:02:19:28 - 00:02:23:14 of certain kinds of string values that are commonly used. 00:02:23:18 - 00:02:27:07 But it also provides a pattern, which is a simple regular expression. 00:02:27:13 - 00:02:30:14 The interaction between format and pattern in the JSON 00:02:30:14 - 00:02:34:04 Schema is undefined, as far as I've been able to determine. 00:02:34:16 - 00:02:37:26 So if you define a format, are you allowed to define a pattern as well? 00:02:38:03 - 00:02:41:01 What if the defined pattern conflicts with the format? 00:02:41:01 - 00:02:42:24 So it's all a mystery, I don't know.
00:02:42:24 - 00:02:48:06 So, Pydantic has the concept of formats and regular expressions 00:02:48:06 - 00:02:51:27 as well, but it's very particular about how you mix and match them. 00:02:52:14 - 00:02:54:01 The datamodel-code-generator tool 00:02:54:01 - 00:02:57:11 just did a simple conversion and shoved it into a text file. 00:02:57:19 - 00:03:00:24 So I had some problems and those had to be cleaned up by hand. 00:03:00:24 - 00:03:04:15 datamodel-code-generator also does not validate regular expressions that are 00:03:04:18 - 00:03:05:22 created during conversion. 00:03:05:22 - 00:03:10:00 So for example, there's this regular expression up on the slide 00:03:10:10 - 00:03:14:07 which uses some Unicode regular expression sequences 00:03:14:08 - 00:03:16:17 that don't work in Python, so you have to translate them. 00:03:16:19 - 00:03:18:16 So that needs to be done by hand. 00:03:18:18 - 00:03:21:26 Nevertheless, despite all of my complaining, it is possible 00:03:21:26 - 00:03:25:00 to produce a basic lightweight 00:03:25:13 - 00:03:29:26 library for generation and validation of OSCAL data in Python, 00:03:30:05 - 00:03:33:24 which I have released as OSCAL-Pydantic v1. 00:03:33:24 - 00:03:37:26 So let's talk a little bit about OSCAL and Python. 00:03:38:02 - 00:03:39:17 Here's an OSCAL data element. 00:03:39:17 - 00:03:41:12 This is the hash element. 00:03:41:12 - 00:03:46:15 It appears in back matter under the rlink, or resource link, element. 00:03:46:15 - 00:03:51:03 It's designed to ensure that the integrity of remote resources can be verified 00:03:51:03 - 00:03:53:24 if they're included as part of an OSCAL model. 00:03:53:24 - 00:03:58:19 A hash consists of two values, both of which, as you can see, are technically optional. 00:03:58:22 - 00:04:02:01 First is the algorithm used to define the hash. 00:04:02:09 - 00:04:05:20 It has to be expressed as a Metaschema string.
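To make the earlier point about regular expressions concrete, here is a minimal sketch of the kind of hand translation described. The pattern below is an illustrative stand-in, not necessarily the one shown on the slide, and the replacement classes are approximations chosen for this example: Python's built-in `re` module rejects `\p{L}` and `\p{N}` as bad escapes, so they have to be rewritten before the pattern will compile.

```python
import re

# Illustrative schema-style pattern that uses Unicode property classes.
# (An assumption for this sketch; the slide's actual pattern is not in the transcript.)
SCHEMA_PATTERN = r"^(\p{L}|_)(\p{L}|\p{N}|[.\-_])*$"

# Approximate equivalents that Python's re module accepts.
TRANSLATIONS = {
    r"\p{L}": r"[^\W\d_]",  # word characters minus digits and underscore
    r"\p{N}": r"\d",        # decimal digits only, which is narrower than \p{N}
}


def translate_pattern(pattern: str) -> str:
    """Rewrite the Unicode property classes listed in TRANSLATIONS."""
    for schema_class, python_class in TRANSLATIONS.items():
        pattern = pattern.replace(schema_class, python_class)
    return pattern


# The translated pattern compiles; the original one would not.
TOKEN = re.compile(translate_pattern(SCHEMA_PATTERN))

print(bool(TOKEN.match("control-1")))  # starts with a letter
print(bool(TOKEN.match("1control")))   # starts with a digit
```

Because the replacements are only approximate, a translated pattern like this still deserves its own tests against known-good and known-bad values.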
00:04:06:07 - 00:04:09:07 You can see the reference to string next to algorithm, 00:04:09:13 - 00:04:13:11 and it's further constrained to a list of supported algorithms. 00:04:13:11 - 00:04:17:07 Although, important to note, it says the value may be locally defined. 00:04:17:08 - 00:04:19:20 So actually there are no constraints. 00:04:19:20 - 00:04:22:05 Secondly, we have a value, 00:04:22:05 - 00:04:23:19 which is the value of the hash. 00:04:23:19 - 00:04:25:21 So we've got an algorithm and a value. 00:04:25:21 - 00:04:28:21 This is also a Metaschema string, as you can see. 00:04:28:27 - 00:04:31:01 And it is also constrained. 00:04:31:01 - 00:04:34:21 The constraints depend on the value of the algorithm chosen. 00:04:34:21 - 00:04:39:12 Whichever algorithm you choose will dictate the format of the string 00:04:39:14 - 00:04:42:22 that you're allowed to put in the value of a hash. 00:04:42:28 - 00:04:45:11 In OSCAL-Pydantic v1, 00:04:45:11 - 00:04:47:20 you can see what this looks like. 00:04:47:20 - 00:04:51:00 This is pretty close to what you'd see in Compliance Trestle, 00:04:51:05 - 00:04:54:05 since we use the same tooling to generate both models. 00:04:54:24 - 00:04:57:08 So you'll see a few important features. 00:04:57:08 - 00:05:00:13 First of all, the hash is a subclass of BaseModel. 00:05:00:17 - 00:05:04:29 This is the basic model in Pydantic, so any additional requirements 00:05:04:29 - 00:05:07:29 such as extra.forbid, 00:05:08:08 - 00:05:11:24 which states that you cannot include any elements that are not part of 00:05:11:24 - 00:05:16:03 the model, have to be added for each of the models that you define. 00:05:16:08 - 00:05:19:08 The fields are defined as strings. 00:05:19:11 - 00:05:22:01 This is a basic Python data object. 00:05:22:01 - 00:05:24:22 It's just a string. There are no constraints. 00:05:24:22 - 00:05:29:07 Constraints are defined as regular expressions for each of the 00:05:29:09 - 00:05:30:16 attributes of the model.
00:05:30:16 - 00:05:34:18 You can see that the value doesn't have any constraints added at all. 00:05:34:18 - 00:05:35:15 It's just a string. 00:05:35:15 - 00:05:38:07 There are some challenges with this approach. 00:05:38:07 - 00:05:42:23 First of all, in my own work, while it was useful for me to be able 00:05:42:23 - 00:05:46:05 to produce a basic catalog in Python very quickly, 00:05:46:07 - 00:05:50:26 the auto-generated schemas are a little bit tough to read and use. 00:05:50:26 - 00:05:55:16 Lots of use of root models and a lot of repetition in the code. 00:05:55:28 - 00:05:58:22 So for example, you'll see that regex for an OSCAL 00:05:58:22 - 00:06:01:23 Metaschema string repeated over and over and over. 00:06:01:23 - 00:06:05:03 That makes it a little bit difficult to extend or customize 00:06:05:03 - 00:06:09:05 the libraries. You'll basically end up writing your own models from scratch, 00:06:09:05 - 00:06:14:14 and you can create models that are valid but violate basic constraints of OSCAL 00:06:14:14 - 00:06:17:06 because you have to remember to include this from that 00:06:17:06 - 00:06:19:17 and that other from this other point in the schema. 00:06:19:17 - 00:06:23:03 And it's basically very complicated for somebody who wants to extend. So 00:06:23:03 - 00:06:26:10 in addition to those, there are also issues 00:06:26:10 - 00:06:29:22 that are inherited from the limitations of the JSON Schema. 00:06:30:08 - 00:06:32:25 We already talked about values 00:06:32:25 - 00:06:36:18 and how they can be formats, or they can include regexes. 00:06:36:18 - 00:06:41:07 And it's not exactly clear how those two concepts are supposed to interact. 00:06:41:10 - 00:06:45:12 Specifications are, as far as I was able to tell, silent on it.
00:06:45:15 - 00:06:49:04 That can be overcome for relatively simple 00:06:49:07 - 00:06:53:25 use cases like making sure that a string only contains printable characters, 00:06:53:28 - 00:06:57:01 but there isn't really a way in JSON Schema 00:06:57:01 - 00:07:00:01 at all to define relationships between attributes. 00:07:00:01 - 00:07:05:06 So if we look at the hash, if the algorithm is SHA-224, 00:07:05:06 - 00:07:09:16 then the value has to be a 28 character string, and it can only be 00:07:09:16 - 00:07:14:21 comprised of the hex digits zero through nine, a through f. 00:07:14:24 - 00:07:20:05 There is no way in JSON Schema to express relationships between objects. 00:07:20:11 - 00:07:24:27 As a result, we had to go back to the drawing board for OSCAL-Pydantic v2. 00:07:25:05 - 00:07:28:09 First of all, OSCAL-Pydantic v2 leverages 00:07:28:09 - 00:07:31:09 Pydantic v2 versus Pydantic v1. 00:07:31:10 - 00:07:33:27 Pydantic v2 is fast, much faster. 00:07:33:27 - 00:07:38:15 17 times usually, but somewhere between four times and 50 times faster. 00:07:38:15 - 00:07:42:11 Rather than using tooling to automatically generate the models, we're 00:07:42:11 - 00:07:46:09 producing everything by hand, which makes the implementation slower, 00:07:46:20 - 00:07:52:00 but hopefully makes the use of the library much easier. 00:07:52:05 - 00:07:55:18 So there will be a lot less repetition in the code. 00:07:55:22 - 00:07:59:22 It's really designed for human beings to extend and customize. 00:08:00:11 - 00:08:04:06 It has a closer alignment with the underlying Metaschema. 00:08:04:06 - 00:08:08:03 But remember that the JSON Schema is a subset of Metaschema that leaves out 00:08:08:03 - 00:08:11:08 some important stuff, because there's just no way to express it in JSON Schema. 00:08:11:12 - 00:08:16:08 Of course, while this is in progress, the objective is complete 00:08:16:08 - 00:08:20:12 support for all of the validation rules expressed in OSCAL.
00:08:20:12 - 00:08:24:15 So let's look at what our hash looks like in OSCAL-Pydantic v2. 00:08:24:20 - 00:08:29:12 The first thing to notice is that we define something called an OSCAL string. 00:08:29:12 - 00:08:32:28 An OSCAL string is technically a Metaschema string, 00:08:33:04 - 00:08:34:18 but we call it an OSCAL string, 00:08:34:18 - 00:08:38:11 and it encapsulates the central regular expression 00:08:38:11 - 00:08:41:11 for a string, which is basically, you can only have printable characters. 00:08:41:14 - 00:08:47:20 Then we use that data type elsewhere in our model definitions so that we inherit 00:08:47:24 - 00:08:51:08 those basic constraints and we don't have to repeat them in the code. 00:08:51:13 - 00:08:54:15 And also anybody who wants to extend and create a new OSCAL 00:08:54:16 - 00:08:58:17 object has all of the basic data elements that are available 00:08:58:17 - 00:09:01:27 and can use them and be comfortable that all of the restrictions are in place. 00:09:01:29 - 00:09:04:05 We can also use None. 00:09:04:19 - 00:09:08:02 Recall that in a hash, each of the elements is technically optional. 00:09:08:02 - 00:09:09:19 You can have 0 or 1. 00:09:09:19 - 00:09:12:29 So we allow you to have either an OSCAL string or None. 00:09:12:29 - 00:09:17:22 Our hash class is no longer a subclass of BaseModel. 00:09:17:22 - 00:09:20:23 Instead, it subclasses something called OSCAL model. 00:09:20:29 - 00:09:23:28 OSCAL model is a subclass of BaseModel, but 00:09:23:28 - 00:09:27:13 it centralizes and consolidates a lot of the logic that you need 00:09:27:15 - 00:09:31:23 for various aspects of translation, which we'll discuss in detail later. 00:09:31:25 - 00:09:35:24 Notice that we are able to add additional constraints on top. 00:09:35:24 - 00:09:39:08 So we can say, for example, it has to be an OSCAL string. 00:09:39:08 - 00:09:42:12 But of course it also has to be SHA- 00:09:42:15 - 00:09:46:16 224, 256, 384 or 512 or the SHA-3 variants.
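The slide is not part of the transcript, so what follows is only a sketch of the shape being described, written against Pydantic v2. The names `OscalString`, `OscalModel`, `Algorithm` and `EXPECTED_HEX_DIGITS` are hypothetical, the string pattern is an assumption standing in for the real Metaschema rule, and the only length rule included is the SHA-224 one quoted in the talk.

```python
import re
from typing import Annotated, Literal, Optional

from pydantic import BaseModel, ConfigDict, Field, ValidationError, model_validator

# Hypothetical reusable "OSCAL string" type: the constraint is declared once
# and inherited wherever the type is used. The pattern is an assumption.
OscalString = Annotated[str, Field(pattern=r"^\S(.*\S)?$")]

# Algorithm names listed in the talk: SHA-2 and the SHA-3 variants.
Algorithm = Literal[
    "SHA-224", "SHA-256", "SHA-384", "SHA-512",
    "SHA3-224", "SHA3-256", "SHA3-384", "SHA3-512",
]

# Hex digits expected per algorithm. Only the SHA-224 rule is stated in the
# talk, so the other algorithms are deliberately left out of this sketch.
EXPECTED_HEX_DIGITS = {"SHA-224": 28}


class OscalModel(BaseModel):
    """Hypothetical shared base so each model does not repeat its settings."""

    model_config = ConfigDict(extra="forbid")


class Hash(OscalModel):
    algorithm: Optional[Algorithm] = None  # zero or one
    value: Optional[OscalString] = None    # zero or one

    @model_validator(mode="after")
    def value_matches_algorithm(self) -> "Hash":
        """Cross-field rule: the algorithm dictates the format of the value."""
        expected = EXPECTED_HEX_DIGITS.get(self.algorithm)
        if expected is not None and self.value is not None:
            if not re.fullmatch(rf"[0-9a-fA-F]{{{expected}}}", self.value):
                raise ValueError(
                    f"{self.algorithm} value must be {expected} hex digits"
                )
        return self


good = Hash(algorithm="SHA-224", value="0f" * 14)

try:
    Hash(algorithm="SHA-224", value="0f" * 32)  # hex, but the wrong length
except ValidationError as error:
    print(error.error_count(), "validation error")
```

The design point is that the cross-field rule lives in ordinary Python inside the model, which is exactly what a JSON Schema `pattern` on a single field cannot express.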
00:09:46:16 - 00:09:49:23 Basically the objective is that by combining 00:09:49:23 - 00:09:53:15 all of these Python and Pydantic features in a more sophisticated way, 00:09:53:24 - 00:09:58:10 we create something that's more developer friendly at this layer, versus 00:09:58:10 - 00:10:01:23 something that is generated by a machine and is really just designed 00:10:02:04 - 00:10:07:03 to allow you to very quickly produce a gigantic model from a JSON Schema. 00:10:07:03 - 00:10:07:29 Of course, 00:10:07:29 - 00:10:12:11 the important thing as well is that we are using Pydantic, which allows you 00:10:12:11 - 00:10:17:26 to unlock the full power of Python, so we can define arbitrarily complex 00:10:17:26 - 00:10:19:09 restrictions and rules. 00:10:19:09 - 00:10:20:29 You can see here that we have 00:10:20:29 - 00:10:25:00 all of those restrictions. Basically, if you use SHA-224, 00:10:25:02 - 00:10:29:24 the value had better be a 28 character string that's only composed of hex digits. 00:10:30:03 - 00:10:33:29 And we can create that restriction any way that we want in Python, 00:10:33:29 - 00:10:35:22 because it's Python. 00:10:35:22 - 00:10:40:01 So that really allows us to go to the next level, which is to define 00:10:40:05 - 00:10:44:22 relationships between elements and really capture the full schema that way. 00:10:44:26 - 00:10:49:11 So that's what one model looks like in OSCAL-Pydantic v2. 00:10:49:11 - 00:10:50:21 What I'd like to do, 00:10:50:21 - 00:10:53:29 understanding now the reasoning, and now that we've seen an example, 00:10:53:29 - 00:10:57:22 is take you on a very brief tour of the package itself. 00:10:58:17 - 00:11:01:20 So, the OSCAL-Pydantic package, 00:11:01:28 - 00:11:04:28 of course, at the top level has the models.
00:11:05:02 - 00:11:07:25 The models are the catalogs, profiles, those 00:11:07:25 - 00:11:12:22 top-level elements that are primarily what people will be interacting 00:11:12:22 - 00:11:15:04 with when they interact with this library, 00:11:15:04 - 00:11:17:04 even if they are interacting as developers. 00:11:17:04 - 00:11:22:11 Because extensibility was a core goal, we also have identified a subpackage 00:11:22:11 - 00:11:26:22 called core, which has the basic building blocks of all the models. 00:11:26:29 - 00:11:31:08 Mostly you can ignore it unless you need to extend any of the models. 00:11:31:18 - 00:11:33:06 Let's talk about the core models. 00:11:33:06 - 00:11:35:22 First we have data types. 00:11:35:22 - 00:11:40:02 This encapsulates the core Metaschema data types 00:11:40:02 - 00:11:44:08 like string and date time and URL and all of those things. 00:11:44:08 - 00:11:47:14 We have something called the OSCAL model, 00:11:47:17 - 00:11:50:24 which is a subclass of Pydantic BaseModel, 00:11:50:29 - 00:11:55:15 which encapsulates a lot of behavior that's relevant in OSCAL. 00:11:55:24 - 00:11:59:18 So, for example, OSCAL JSON specifically loves 00:11:59:18 - 00:12:03:14 to have dashes in variable names or in element names. 00:12:03:17 - 00:12:07:23 Python does not like that because Python sees the dash as a minus sign. 00:12:07:27 - 00:12:10:19 And so Python likes to use snake case. 00:12:10:19 - 00:12:14:06 So there's just a function in there that will automatically translate 00:12:14:06 - 00:12:17:13 anything with dashes into the equivalent snake case. 00:12:17:17 - 00:12:20:27 There are elements of the OSCAL models that are called class. 00:12:21:09 - 00:12:24:23 Of course, class is a reserved word in Python because you use it 00:12:24:23 - 00:12:28:12 to define a Python class, which we use very heavily.
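One way this kind of name translation can be done in Pydantic v2 is with an alias generator on a shared base class. This is a sketch under assumptions, not the library's actual code: `to_oscal_name`, `OscalModel` and the `Demo` model with its two fields are made up for illustration.

```python
from typing import Optional

from pydantic import BaseModel, ConfigDict


def to_oscal_name(field_name: str) -> str:
    """Map a Python field name to its OSCAL JSON key.

    last_modified -> last-modified, and class_ -> class (the trailing
    underscore is how this sketch sidesteps the reserved word).
    """
    return field_name.rstrip("_").replace("_", "-")


class OscalModel(BaseModel):
    """Hypothetical shared base that applies the translation everywhere."""

    model_config = ConfigDict(
        alias_generator=to_oscal_name,
        populate_by_name=True,  # also accept the Python names in code
    )


class Demo(OscalModel):
    """Made-up model; the field names are only for illustration."""

    last_modified: str
    class_: Optional[str] = None


# Import data that uses the OSCAL-style keys...
demo = Demo.model_validate({"last-modified": "2024-01-01", "class": "example"})

# ...work with snake_case attributes in Python...
print(demo.last_modified, demo.class_)

# ...and export with the OSCAL-style keys again.
print(demo.model_dump(by_alias=True))
```

Subclasses inherit the configuration, so a new model gets the same translation without repeating it.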
00:12:28:19 - 00:12:33:02 So there is a function in there that will automatically translate 00:12:33:02 - 00:12:36:02 anything called a class into something else. 00:12:36:02 - 00:12:39:03 So you can define something called something else, 00:12:39:03 - 00:12:43:00 and it'll automatically be translated when you export into a class. 00:12:43:06 - 00:12:47:02 Those are some of the key attributes encapsulating that behavior. 00:12:47:08 - 00:12:49:01 We have some friendly features. 00:12:49:01 - 00:12:51:19 When you dump Pydantic to JSON, 00:12:51:19 - 00:12:55:03 by default, it will dump anything, including nulls, and 00:12:55:04 - 00:12:56:00 we don't want those. 00:12:56:00 - 00:12:56:29 Why would we want nulls? 00:12:56:29 - 00:12:58:08 So right now, 00:12:58:08 - 00:13:00:28 if you use OSCAL model, it will automatically exclude nulls. 00:13:00:28 - 00:13:04:03 So you'll only get data elements where there's actually some value. 00:13:04:04 - 00:13:06:11 And it also pretty prints because I like pretty printing 00:13:06:11 - 00:13:07:26 because I always end up looking at the JSON. 00:13:07:26 - 00:13:09:18 Now on to the interesting thing. 00:13:09:18 - 00:13:12:17 One of the ideas we're playing with, which we'll come back to later 00:13:12:17 - 00:13:16:06 in the presentation, is validation mechanisms. 00:13:16:14 - 00:13:20:14 And one of the ideas behind having a common OSCAL model 00:13:20:19 - 00:13:25:09 is that you can put a lot of the core validation logic in there, 00:13:25:12 - 00:13:29:06 which makes extension very easy because you don't have to write anything. 00:13:29:06 - 00:13:31:04 You just say these are the allowed attributes 00:13:31:04 - 00:13:32:14 and the relationship between them, 00:13:32:14 - 00:13:36:27 and it will be triggered within OSCAL model when you inherit from OSCAL model. 00:13:36:27 - 00:13:37:29 Other core modules.
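For the export behaviour just described, a rough illustration is a base class that supplies friendlier defaults to Pydantic v2's `model_dump_json`. The helper name `oscal_json` is hypothetical; the talk only says the base model excludes nulls and pretty-prints, not how it does so.

```python
import json
from typing import Any, Optional

from pydantic import BaseModel


class OscalModel(BaseModel):
    """Hypothetical shared base with friendlier JSON export defaults."""

    def oscal_json(self, **kwargs: Any) -> str:
        # Callers can still override either default explicitly.
        kwargs.setdefault("exclude_none", True)  # leave out unset (null) fields
        kwargs.setdefault("indent", 2)           # pretty-print the output
        return self.model_dump_json(**kwargs)


class Hash(OscalModel):
    algorithm: Optional[str] = None
    value: Optional[str] = None


partial = Hash(algorithm="SHA-256")

print(partial.model_dump_json())  # plain Pydantic default keeps "value": null
print(partial.oscal_json())       # only the fields that carry a value
```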
00:13:37:29 - 00:13:41:06 There are a lot of elements like back matter or 00:13:41:06 - 00:13:44:07 metadata that appear in a lot of different models. 00:13:44:14 - 00:13:47:23 I've extracted all of those and put them in something called common, 00:13:48:02 - 00:13:50:25 which provides access to those directly so that you can 00:13:50:25 - 00:13:53:25 instantiate them directly or incorporate them in other ways. 00:13:53:29 - 00:13:55:14 I also have something called properties. 00:13:55:14 - 00:14:01:11 Properties are also a common model, but they are much more complex and 00:14:01:18 - 00:14:06:06 really are one of the key points of extensibility in OSCAL. 00:14:06:14 - 00:14:09:24 So they have their own module. The intention is that they're user 00:14:09:24 - 00:14:13:01 extensible and that they implement the full set of existing constraints 00:14:13:10 - 00:14:16:04 and all of the documented OSCAL properties. 00:14:16:04 - 00:14:17:12 And we'll come back to that, 00:14:17:12 - 00:14:20:15 because that's an area where we're still in development. 00:14:20:20 - 00:14:23:18 Finally, we've got the core data models. 00:14:23:18 - 00:14:26:27 Right now I'm working on catalog because that's the one I need first. 00:14:26:27 - 00:14:29:06 So I'm selfish. So I'm going to start with that one. 00:14:29:06 - 00:14:33:20 But of course my belief is that once we get all the underlying data structures 00:14:33:20 - 00:14:36:27 and all of the libraries properly composed, 00:14:37:04 - 00:14:40:10 it'll be relatively straightforward to add on system 00:14:40:10 - 00:14:43:10 security plan, profile, and tackle the rest of them one by one. 00:14:43:12 - 00:14:47:14 We will have all of the common elements defined and we'll have 00:14:47:14 - 00:14:51:13 worked out how to do things in an extensible way with minimum boilerplate. 00:14:52:00 - 00:14:54:15 This is not something special. 00:14:54:15 - 00:14:57:13 This is really pretty much just Pydantic, right.
00:14:57:13 - 00:15:00:21 So Pydantic provides support for all of these functions. 00:15:01:02 - 00:15:04:20 We are just adding the additional restrictions that are relevant to OSCAL. 00:15:04:28 - 00:15:08:14 The straightforward way to create an OSCAL object 00:15:08:14 - 00:15:12:09 with the library is just to create the object directly in Python. 00:15:12:16 - 00:15:13:24 From the core package, 00:15:13:24 - 00:15:16:22 you can import common; inside there is hash. 00:15:16:22 - 00:15:18:22 So then you can just create a hash. 00:15:18:22 - 00:15:20:15 So you can provide the algorithm. 00:15:20:15 - 00:15:21:29 You can provide the value 00:15:21:29 - 00:15:26:10 and it will automatically validate that those values are correct 00:15:26:17 - 00:15:28:09 and are related to each other in the right way. 00:15:28:09 - 00:15:31:13 So you get the full power of the validated OSCAL, 00:15:31:17 - 00:15:34:20 including the relationship between the elements of the model. 00:15:34:25 - 00:15:37:16 Then you can see that you've created an object. 00:15:37:16 - 00:15:40:20 What if you get a Python object somewhere else? 00:15:40:26 - 00:15:44:09 What's important here is that you can get it from some other source. 00:15:44:16 - 00:15:46:20 But basically you create the object somewhere else 00:15:46:20 - 00:15:50:01 and then say, oh, by the way, Pydantic, I'd like you to treat this as a hash. 00:15:50:08 - 00:15:53:09 And then that allows you to have all of the validation rules 00:15:53:09 - 00:15:58:00 and all of those other things based on data that comes from some other source. 00:15:58:04 - 00:16:02:27 Pydantic loves JSON, so you can import JSON directly. 00:16:02:27 - 00:16:05:29 So if you just have a JSON string or you have a big catalog, you can say, 00:16:06:02 - 00:16:07:05 please take a hash, 00:16:07:06 - 00:16:11:13 it's expressed as a JSON string, please validate it and create a model.
00:16:11:18 - 00:16:15:11 And it will happily do that, provided that the data is of the correct format 00:16:15:11 - 00:16:16:21 and the contents work out. 00:16:16:21 - 00:16:19:21 And again, you have an object that you can print. 00:16:19:21 - 00:16:20:25 You can do whatever you need. 00:16:20:25 - 00:16:24:23 You can build larger data structures with it, large OSCAL structures. 00:16:24:23 - 00:16:26:10 You can do whatever you need to. 00:16:26:10 - 00:16:27:23 Great. JSON is fantastic. 00:16:27:23 - 00:16:29:06 What about YAML or XML? 00:16:29:06 - 00:16:32:10 The nature of Pydantic is that it really likes JSON and doesn't care 00:16:32:10 - 00:16:35:17 much for anything else. Some work is being done in that area 00:16:35:17 - 00:16:40:01 by third parties, but for right now, what I planned to do was use a separate 00:16:40:01 - 00:16:45:25 Python library to convert the YAML or the XML into Python objects, 00:16:45:25 - 00:16:47:04 and then just import 00:16:47:04 - 00:16:51:03 those Python objects into Pydantic using the model_validate call. 00:16:51:15 - 00:16:53:29 Some examples of what that will look like in practice. 00:16:53:29 - 00:16:56:19 Of course we started with hash, which is very simple. 00:16:56:19 - 00:16:58:10 It's a very minimal use case. 00:16:58:10 - 00:17:01:09 But obviously, if you were talking about something like a catalog, 00:17:01:09 - 00:17:04:21 there would be a ton of nested models, right? 00:17:04:21 - 00:17:06:24 Models within models, within models. 00:17:06:24 - 00:17:11:05 And that's what the entire OSCAL-Pydantic v2 library provides. 00:17:11:05 - 00:17:14:25 It's so that you could take a catalog, you could say import as JSON, 00:17:15:03 - 00:17:18:29 and the library will figure out what type of object 00:17:18:29 - 00:17:22:03 each child is. It will import everything, 00:17:22:03 - 00:17:27:02 give you very nice structured Python, and enforce all of those constraints 00:17:27:02 - 00:17:28:02 that you'd like to see.
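The three import routes described above map onto standard Pydantic v2 calls. The sketch below uses simplified stand-in models rather than the library's own classes: `Hash` carries no OSCAL-specific rules here, and `ResourceLink` with its `href` and `hashes` fields is a cut-down assumption about the rlink element.

```python
import json
from typing import List, Optional

from pydantic import BaseModel, ConfigDict, Field, ValidationError


class Hash(BaseModel):
    """Simplified stand-in, without the OSCAL-specific constraints."""

    model_config = ConfigDict(extra="forbid")
    algorithm: Optional[str] = None
    value: Optional[str] = None


class ResourceLink(BaseModel):
    """Simplified stand-in for a parent element that nests hashes."""

    model_config = ConfigDict(extra="forbid")
    href: str
    hashes: List[Hash] = Field(default_factory=list)


digest = "0f" * 14

# 1. Create the object directly in Python.
direct = Hash(algorithm="SHA-224", value=digest)

# 2. Validate a plain Python object from some other source, for example the
#    dict a YAML or XML parser would hand back.
from_dict = Hash.model_validate({"algorithm": "SHA-224", "value": digest})

# 3. Validate JSON text directly.
from_json = Hash.model_validate_json(
    json.dumps({"algorithm": "SHA-224", "value": digest})
)

# Nested data: each child is built as the model type declared on the parent.
link = ResourceLink.model_validate_json(
    json.dumps({"href": "https://example.com/doc.pdf",
                "hashes": [{"algorithm": "SHA-224", "value": digest}]})
)
print(type(link.hashes[0]).__name__)

# Data that does not fit the model is rejected rather than silently accepted.
try:
    Hash.model_validate({"algorithm": "SHA-224", "colour": "blue"})
except ValidationError as error:
    print(error.error_count(), "validation error")
```

All three routes run the same validators, so where the data came from does not change what is accepted.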
00:17:28:02 - 00:17:29:03 Final section: 00:17:29:03 - 00:17:31:22 again, don't forget that V1 is available today. 00:17:31:22 - 00:17:35:27 It's available on GitHub, and it's been published on PyPI. 00:17:35:27 - 00:17:38:29 So you can just pip install oscal-pydantic 00:17:39:01 - 00:17:40:27 and you'll have version one, 00:17:40:27 - 00:17:45:02 which doesn't implement all of the constraints but is useful. 00:17:45:04 - 00:17:48:21 But of course, since it's open source, contributions are welcome. 00:17:48:21 - 00:17:49:19 And I would like to talk 00:17:49:19 - 00:17:52:19 about some of the areas where I would like to see some help. 00:17:52:23 - 00:17:55:11 The first thing, for me, is the need to fix properties. 00:17:55:11 - 00:17:56:07 Properties are proving 00:17:56:07 - 00:18:00:00 to be very complicated to implement because it's relatively straightforward 00:18:00:07 - 00:18:05:27 to implement something that has a set of restrictions that nobody needs to extend. 00:18:05:27 - 00:18:09:03 Then you just create custom models, you create custom functions, 00:18:09:03 - 00:18:11:05 you use model_validate or whatever you need to do. 00:18:11:05 - 00:18:12:18 But properties are special. 00:18:12:18 - 00:18:14:29 In addition to all the existing OSCAL properties, 00:18:14:29 - 00:18:18:10 the first thing I'm going to want to do when I get a catalog working 00:18:18:15 - 00:18:21:17 is start defining my own custom properties. 00:18:21:17 - 00:18:24:21 I need to make sure that the properties that I define 00:18:24:27 - 00:18:28:05 and the way that I implement properties is something that's very easily 00:18:28:05 - 00:18:31:12 extensible, and I'm struggling to find the right way to do that 00:18:31:13 - 00:18:33:00 that's very developer friendly.
00:18:33:00 - 00:18:36:22 We need to finish catalog, and another thing that's very important, 00:18:36:22 - 00:18:40:15 that I'll come back to, is creating things that are necessary 00:18:40:15 - 00:18:41:19 for proper test cases. 00:18:41:19 - 00:18:47:04 I think that that will be necessary for me, but useful for a lot of other people. 00:18:47:04 - 00:18:49:04 We're going to revisit that in a couple of minutes. 00:18:49:04 - 00:18:53:00 Of course I need help. If you know Python, if you know Pydantic 00:18:53:00 - 00:18:54:24 and you're interested in chipping in, the thing I'm 00:18:54:24 - 00:18:57:15 struggling with the most right now really is properties. 00:18:57:19 - 00:18:58:28 From the perspective of OSCAL, 00:18:58:28 - 00:19:01:22 they have very complex validation rules. 00:19:01:23 - 00:19:07:02 There are numerous different types of properties, and they're designed to be extensible. 00:19:07:04 - 00:19:11:09 If you read the specification, you can see that you can define your own namespace 00:19:11:09 - 00:19:14:18 and then you can define your own values, you can define constraints 00:19:14:18 - 00:19:16:27 on the different elements within the property. 00:19:16:27 - 00:19:18:07 All of that's very important. 00:19:18:07 - 00:19:21:21 But as I said, the first thing I'm going to do is start creating my own properties. 00:19:21:26 - 00:19:26:21 So I need to make sure that the library allows developers 00:19:26:21 - 00:19:30:11 to extend properties in a very easy and consistent way. 00:19:30:23 - 00:19:35:14 So we have to provide full validation of all of the existing properties. 00:19:35:16 - 00:19:37:09 We have to support extension. 00:19:37:09 - 00:19:39:03 So that's what I'm currently working on. 00:19:39:03 - 00:19:42:24 The next thing that I think is very relevant for the developer 00:19:42:24 - 00:19:44:14 community is to talk about test cases.
00:19:44:14 - 00:19:48:06 I have already found scenarios where I spent a couple of days 00:19:48:06 - 00:19:52:15 fixing one bug, only to discover that I had broken ten others. 00:19:52:19 - 00:19:56:21 And so, of course, with complex packages, if you want them to be 00:19:56:28 - 00:20:00:10 of a very high quality, you need to be able to do integration testing, right? 00:20:00:10 - 00:20:02:04 You need unit tests, you need integration 00:20:02:04 - 00:20:03:20 tests, you need all of that sort of thing. 00:20:03:20 - 00:20:05:07 And for that to happen, 00:20:05:07 - 00:20:10:03 we need a library of OSCAL artifacts that are valid and invalid. 00:20:10:03 - 00:20:13:19 I think it's relatively straightforward to take an existing catalog and, 00:20:13:19 - 00:20:17:26 using some XML parser or JSON parser, break it into all of its sub-elements, 00:20:18:05 - 00:20:21:06 and then you can have those represented as different use cases. 00:20:21:09 - 00:20:24:18 What's a little bit more complicated is coming up with invalid artifacts, 00:20:24:18 - 00:20:28:00 because you might want to test invalid in a lot of different ways. 00:20:28:00 - 00:20:31:07 Of course you can set values to random and that will break things. 00:20:31:11 - 00:20:36:02 But what you want to find out is what happens if you say SHA-224 but then you include 00:20:36:02 - 00:20:39:20 a hash that only makes sense in the context of SHA-256. 00:20:40:05 - 00:20:44:04 Both of those elements are correct when examined independently, 00:20:44:10 - 00:20:45:23 but they don't work together. 00:20:45:23 - 00:20:50:04 Having a really strong library of things that are valid, 00:20:50:04 - 00:20:53:27 both big things like catalogs and little things like hashes, 00:20:53:27 - 00:20:57:13 that are expressed in JSON, YAML 00:20:57:20 - 00:21:01:13 and XML, and that provide negative test cases, things 00:21:01:13 - 00:21:04:24 that should not work if an OSCAL library is operating properly.
00:21:04:24 - 00:21:06:00 That's quite complicated. 00:21:06:00 - 00:21:07:27 It's something I'm going to have to do. 00:21:07:27 - 00:21:09:16 For everyone on the call, 00:21:09:16 - 00:21:13:16 whether you're into Python or not, that will be a very important 00:21:13:16 - 00:21:17:15 capability to have for every project, regardless of the language 00:21:17:15 - 00:21:19:08 you're working in or the domain. 00:21:19:08 - 00:21:21:22 Regardless, I think everyone will benefit from test cases. 00:21:21:22 - 00:21:23:03 So it would be interesting 00:21:23:03 - 00:21:27:10 to talk to the community about how we can come together, define 00:21:27:10 - 00:21:31:05 and have a library of standard test cases that can be imported and leveraged 00:21:31:12 - 00:21:35:14 as scaffolding to help people build libraries that work. 00:21:35:16 - 00:21:38:16 That really is the end of my presentation. 00:21:38:18 - 00:21:43:03 With the presentation being over, I guess we can stop the recording and I can take questions.
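As a small illustration of the negative test cases discussed in the talk, the sketch below uses only the standard library to generate hash fixtures in which each field looks valid on its own but the pair is inconsistent. The length table is an assumption for the example: 28 is the SHA-224 figure quoted in the talk, 32 for SHA-256 is a placeholder, and both should be replaced with the values from the OSCAL constraints. All function names are hypothetical.

```python
import re
from itertools import permutations
from typing import Dict, List

# Hex digits expected per algorithm (assumed values, see the note above).
HEX_DIGITS: Dict[str, int] = {"SHA-224": 28, "SHA-256": 32}


def valid_cases() -> List[dict]:
    """One well-formed hash fixture per algorithm."""
    return [{"algorithm": name, "value": "a" * length}
            for name, length in HEX_DIGITS.items()]


def mismatched_cases() -> List[dict]:
    """Fixtures pairing each algorithm with another algorithm's value length."""
    return [{"algorithm": name, "value": "a" * HEX_DIGITS[other]}
            for name, other in permutations(HEX_DIGITS, 2)]


def fields_look_valid(case: dict) -> bool:
    """Per-field checks only: known algorithm, hex-only value."""
    return (case["algorithm"] in HEX_DIGITS
            and re.fullmatch(r"[0-9a-fA-F]+", case["value"]) is not None)


def is_consistent(case: dict) -> bool:
    """Cross-field check that a correct validator is expected to enforce."""
    return (fields_look_valid(case)
            and len(case["value"]) == HEX_DIGITS[case["algorithm"]])


for case in mismatched_cases():
    # Passes field-by-field inspection, fails the relationship check.
    print(case["algorithm"], len(case["value"]),
          fields_look_valid(case), is_consistent(case))
```

Fixtures like these could be serialized once and shared across implementations in any language, which is the scaffolding idea raised at the end of the talk.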