Scrapy Crawler Framework
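Introduction to the Scrapy crawler framework. Official site: https://www.scrapy.org/ Documentation: https://docs.scrapy.net.cn/en/latest/ A fast, powerful web crawling framework. Installing Scrapy: pip install scrapy, then scrapy -h. Structure of the Scrapy framework: Scrapy is not a function library but a crawler framework. ...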
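Regular expressions (regular expression, regex, RE). A regular expression is an expression that concisely describes a set of strings. It is a tool for capturing the "concise" and "characteristic" aspects of strings. It can be used to decide whether a given string belongs to the set that shares a particular feature. ...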
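To make the "one expression for a set of strings" idea concrete, here is a minimal sketch using Python's standard `re` module. The pattern and the candidate strings are illustrative assumptions, not taken from the original post.

```python
import re

# Assumed example pattern: one expression standing for the set
# {"PN", "PYN", "PYTN", "PYTHN", "PYTHON"}.
PATTERN = re.compile(r"P(Y|YT|YTH|YTHO)?N")

def belongs(s):
    """Return True if the whole string s is in the set described by PATTERN."""
    return PATTERN.fullmatch(s) is not None

# Hypothetical inputs: the last two are outside the set.
candidates = ["PN", "PYN", "PYTN", "PYTHN", "PYTHON", "PYTHONS", "PTN"]
matches = [s for s in candidates if belongs(s)]
print(matches)
```

`fullmatch` is used rather than `search` so that membership is decided on the entire string, which is what "feature membership" suggests here.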
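Three forms of information markup. Marking up information: marked-up information forms an organizational structure and adds a dimension to the information. The markup structure is as valuable as the information itself. Marked-up information can be used for communication, storage, or display. Marked-up information is easier for programs to understand and use. ...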
Getting started with the Beautiful Soup library. Official site: https://www.crummy.com/software/BeautifulSoup/ You didn’t write that awful page. You’re just trying to get some data out of it. Beautiful Soup is here to help. Since 2004, it’s been saving programmers hours or days of work on quick-turnaround screen scraping projects. Beautiful Soup is a Python library designed for quick turnaround projects like screen-scraping. Three features make it powerful: Beautiful Soup provides a few simple methods and Pythonic idioms for navigating, searching, and modifying a parse tree: a toolkit for dissecting a document and extracting what you need. It doesn’t take much code to write an application. Beautiful Soup automatically converts incoming documents to Unicode and outgoing documents to UTF-8. You don’t have to think about encodings, unless the document doesn’t specify an encoding and Beautiful Soup can’t detect one. Then you just have to specify the original encoding. Beautiful Soup sits on top of popular Python parsers like lxml and html5lib, allowing you to try out different parsing strategies or trade speed for flexibility. Beautiful Soup parses anything you give it, and does the tree traversal stuff for you. You can tell it “Find all the links”, or “Find all the links of class externalLink”, or “Find all the links whose urls match "foo.com"”, or “Find the table heading that’s got bold text, then give me that text.” ...
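The four requests quoted above ("find all the links", and so on) can be sketched roughly as follows. The HTML snippet is made up for illustration, and the built-in `html.parser` backend is an assumption chosen so that no extra parser needs to be installed.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical page used only for this illustration.
html = """
<html><body>
  <a class="externalLink" href="http://foo.com/a">A</a>
  <a href="/local">B</a>
  <a class="externalLink" href="http://bar.org/c">C</a>
  <table><tr><th><b>Price</b></th><th>Qty</th></tr></table>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")

# "Find all the links"
all_links = [a["href"] for a in soup.find_all("a")]

# "Find all the links of class externalLink"
external = [a["href"] for a in soup.find_all("a", class_="externalLink")]

# "Find all the links whose urls match foo.com"
foo = [a["href"] for a in soup.find_all("a", href=re.compile(r"foo\.com"))]

# "Find the table heading that's got bold text, then give me that text"
heading = next(th for th in soup.find_all("th") if th.find("b")).get_text()

print(all_links, external, foo, heading)
```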
https://python-requests.org/ Getting started with the Requests library. Installation: pip install requests. Basic usage: import requests; r = requests.get("http://www.baidu.com"); r.status_code returns 200; then set r.encoding = 'utf-8' and read r.text ...
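The basic usage above can be wrapped into a small helper that also handles failures. This is a sketch under assumptions: the function name `fetch_text`, the 10-second timeout, and the use of `apparent_encoding` in place of a hard-coded 'utf-8' are illustrative choices, not taken from the original post.

```python
import requests

def fetch_text(url, timeout=10):
    """Return the decoded body of url, or None if the request fails.

    Hypothetical helper: name and defaults are assumptions for illustration.
    """
    try:
        r = requests.get(url, timeout=timeout)
        r.raise_for_status()               # raise on 4xx/5xx status codes
        r.encoding = r.apparent_encoding   # guess the encoding from the body
        return r.text
    except requests.RequestException:
        # Covers connection errors, timeouts, and bad status codes.
        return None

if __name__ == "__main__":
    page = fetch_text("http://www.baidu.com")
    print(page[:200] if page else "request failed")
```

Returning `None` on failure keeps the caller simple; raising the exception instead would be equally reasonable when the caller needs the reason for the failure.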
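Preface. This started because images in md files exported from Yuque are hotlink-protected, so publishing an article to a blog still means manually copying each image and uploading it again. When an article contains many images, that becomes very laborious. So I wrote a script for this with the help of AI. ...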
https://monyer.com/game/game1/ Level 1 https://monyer.com/game/game1/first.php <script type="text/javascript"> function check(){ if(document.getElementById('txt').value=="  "){ window.location.href="hello.php"; }else{ alert("密码错误"); } } </script> Entering two spaces as the value takes you to the next level. ...
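References: Latest RuoYi 4.8.1 vulnerability: SSTI bypass to obtain the Shiro key, leading to RCE (bypass for all Java versions, PoC included) https://mp.weixin.qq.com/s/4yi0UOTgBCsGK6J8qSz8tQ 某依 (RuoYi) latest stable version 4.8.1 RCE (Thymeleaf template injection bypass) ...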
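CVE-2022-22947 traces back to the author @Wyatt, who in the article Bring Your Own SSRF – The Gateway Actuator described using the Spring Cloud Gateway Actuator to construct SSRF. Building on the exposed Actuator endpoint, the same author then explained in CVE-2022-22947: SpEL Casting and Evil Beans that one can create a route via /actuator/gateway/routes/ and insert a SpEL expression into the filters field. Spring Cloud Gateway evaluates that expression when processing the filters, so a crafted malicious SpEL expression can lead to RCE. ...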
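Reproducing the Smartbi Remote Code Execution Vulnerability (QVD-2025-31926) 1. Vulnerability description: Recently, QiAnXin CERT observed that the vendor had fixed a Smartbi remote code execution vulnerability (QVD-2025-31926). The vulnerability arises because an attacker can bypass authentication and gain privileges through a default resource ID, then combine this with a backend interface to achieve remote code execution, which may lead to full server compromise, data leakage, or takeover of business systems. Given the wide impact of this vulnerability, customers are advised to check and protect their systems as soon as possible. ...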