Hi, I’m Norman Bier, from Carnegie Mellon’s Open Learning Initiative. There’s been plenty said about the past and the present of Open Education, so now let’s talk about the future. Technology has been one of the major drivers of OER: replicability without cost continues to be a necessary condition for OER to succeed, and many are looking to ongoing technological affordances to provide other mechanisms for transforming education. Most frequently I hear suggestions that technology can be transformative by letting students learn anywhere and any time, or that tech can provide rich simulations and labs to students; others, especially in the OER space, continue to believe the access that technology enables will be transformational.

While these affordances are definitely important, at OLI we’ve felt that the technology’s greatest potential is in the ability to collect and apply data: web-based courseware gives us the ability to embed assessment into every student learning opportunity, and to leverage the data captured from these interactions to fundamentally improve learning, driving better hints and feedback to learners and offering faculty more focused cues on where their students are struggling and where they’re succeeding. More broadly, we can interpret these interactions in ways that can continuously improve the design of our courses, and in the aggregate, this information can feed back into the research community, advancing our understanding of how humans learn.

Collecting and using this learning data is controversial, especially given some of the recent data abuses and breaches, and I’m hearing some leaders in our community calling for strong solutions, ranging from dramatically locking down data to renouncing the collection of data entirely. That’s an important conversation, but it’s not the one that I’m engaging in right now. I think we are already in a world where data-driven approaches and materials are being developed, adopted and embraced by vendors, by schools, by foundations and by government. The future is already here. And to be clear, these approaches have already demonstrated an amazing potential to improve outcomes, lower costs and enable new approaches and pedagogies. Optimists among us are right to look at these approaches as an opportunity to move the needle in education, especially among vulnerable learning populations. But the pessimists among us also have real concerns, in that this seems to provide publishers with a sudden opportunity to reconsider their business models and offer subscription services, just at the time when real progress is being made on opening content.

My concerns are a little more direct: effective use of data-driven approaches relies on instrumented content, that is, content that can provide meaningful assessments of student learning. It relies on systems that can capture and store that data, and on algorithms that can analyze and make sense of that learner data. And it relies on displays and dashboards that can make that analysis actionable and understandable. Vendors are out ahead in building out these systems and the courseware that drives them, and open education is getting left behind in this race, with a continuing emphasis on developing static open content, and without systems to leverage what instrumented content is open.

So even as we see a broader recognition and use of OER, we also see publishers moving upmarket into data-driven systems, leaving us with a real risk of open ending up with second-class solutions and second-class OER consumers, which is especially galling when you realize that many of these vendors are populating their systems with OER. When vendors are the primary drivers of this approach, it also leaves us with a real danger of vendor lock-in: all of that data and content ends up stuck in proprietary systems if you leave or if they go out of business. But I’m even more concerned that in this model, the algorithms and models become secret, black box systems. With no visibility into the ways that data is being analyzed or the ways that predictions and recommendations are being made, we and our learners become increasingly dependent upon systems that we don’t understand, and that actually might not be making particularly good suggestions or recommendations. Beyond the lack of direct visibility into these systems (that is, the ability to look into the code), there’s also remarkably little secondary visibility; research on the efficacy of these systems is scarce. Too often what passes as evidence is actually marketing materials masquerading as research, and we’ve seen a few examples of what appeared to be rigorous research into analytics systems turn out to be…well…just wrong.

Absent transparency into the systems that power some of the most popular analytics systems, some educators have looked at the inputs and outputs of the system to try to reverse-engineer what claim to be sophisticated, statistical models, only to find that in some cases some simple averaging approaches can replicate the vendor’s results.

By itself, this should all be enough to make the case for the Open Education movement to set its sights on establishing and using fully open analytic systems, powered by fully open models and algorithms. At a basic level, these kinds of open systems can guarantee that Open Education can continue to compete with (and avoid being exploited by) closed, proprietary approaches. Equally important, such systems not only avoid the data lock-in problem; by establishing clear standards for data and portability, they are also more likely to force vendors to provide more options for porting data into and out of their systems.

Most important, though, an open approach to these types of systems provides full transparency into the workings of the algorithms and models that are driving recommendations. This is absolutely essential to understanding these systems, and essential to making informed decisions on how you leverage and interpret the recommendations these systems make.

Equally important to me, though, is that this type of transparency is essential for any type of scientific exploration of the algorithms and models: understanding how (or if) they work to begin with, engaging in the difficult work of refining them and making them more useful, and, perhaps most importantly, investigating the assumptions that are implicit in the algorithms and models. We’ve seen time and time again that the implicit biases of developers find their way into algorithms: from the algorithms that power social media to the software that drives automatic hand dryers, we’ve seen these biases at work. To borrow from Audrey Watters, our algorithms are not neutral. I can make every effort to identify and compensate for my biases as I develop my own algorithms, but like the bulk of an iceberg, by definition the bulk of our implicit biases are hidden from our own view. When I develop a model to identify and predict learning, it’s pretty likely to prioritize and privilege patterns and behaviors that look like learning to me, and to deemphasize motivations and actions that don’t align quite as well with my experience. As proud as I am of the work and experiences that have led me to Carnegie Mellon, I’m under no illusions that these are necessarily the types of common experiences and motivations upon which we should be basing our common understanding and models of learning. If none of us can fully identify and avoid our own biases, then we must have systems that allow for broader exploration, identification and remediation of these biases. And the transparency that’s inherent in the open approach is the best way that I know to ensure this work can happen.

So, I believe that the next battle for the Open Education movement is in the data-driven education space, and I think that the creation of open systems, algorithms and models, along with the instrumented OER that can drive them, is absolutely critical. I think there’s some real urgency in making progress on this dimension, and I’m worried that too much of our attention continues to be on the development of static, open textbooks. And while we’ve focused on other things, proprietary solutions have made enormous inroads in claiming the data-driven space for their own.

A quick note for those of you watching this as part of the EdX Open Education MOOC: there’s been some great research that suggests that watching videos like this one is one of the worst ways for you to learn; it’s much more effective for you to engage in richer, learn-by-doing activities. Hopefully the Wiley and Siemens team have some great activities lined up, but in case they don’t, I have some homework for you:

Take a quick look around Google and identify some of the leading options for learning analytics and data-driven, instrumented courseware. What can you say about the efficacy of the solutions they offer? Do you have any clarity on how their algorithms operate? Now, see what you can find on current work in open analytics and open learning models: who’s doing the most interesting work? Are there systems that are currently adoptable? If not, why not? What needs to happen before there are practical, open solutions in place? If you have a moment, take a look at some of the online conversations happening in response to the new Creative Commons Open Education Platform; Creative Commons’ Cable Green has been outspoken in promoting the need for open systems in this space. Do you see this reflected in the platform’s current conversations? Finally, take a few minutes to read Audrey Watters’ essay “Identity, Power and Education’s Algorithms” and then make a list of the kinds of biases you can imagine creeping into the kinds of algorithms we’re exploring. Which ones are the most concerning? How can an open approach help mitigate those concerns?

Hope you’ve found this interesting. Questions? Comments? Howls of disbelief & derision? Let me know on Twitter: @normanbier